RECONSTITUTED POLYPEPTIDES

Portions of the present invention were made with support of the United 
States Government via a grant from the National Institutes of Health under grant 
numbers R29-GM 55042 and R01-DK63090, and via a grant from the 
Department of Defense under grant number DAMD17-01-1-0385. The U.S. 
Government therefore may have certain rights in the invention. 

Claim of Priority 

This application claims priority under 35 U.S.C. § 119(e) from U.S.
Provisional Application Serial No. 60/386,991, filed June 6, 2002, which
application is incorporated herein by reference.

BACKGROUND OF THE INVENTION 

U.S. Patent Applications Serial Nos. 09/096,749, which corresponds to 
Publication No. US 2002 0019517, and 09/903,412 are hereby incorporated by 
reference in their entirety. 



Antibody structure 

A standard antibody (Ab) is a tetrameric structure consisting of two
identical immunoglobulin (Ig) heavy chains and two identical light chains. The
heavy and light chains of an Ab consist of different domains. Each light chain
has one variable domain (VL) and one constant domain (CL), while each heavy
chain has one variable domain (VH) and three or four constant domains (CH)
(Alzari et al., 1988). Each domain, consisting of ~110 amino acid residues, is
folded into a characteristic β-sandwich structure formed from two β-sheets
packed against each other, the immunoglobulin fold. The VH and VL domains
each have three complementarity determining regions (CDR1-3) that are loops,
or turns, connecting β-strands at one end of the domains (Fig. 1: A, C). The
variable regions of both the light and heavy chains generally contribute to
antigen specificity, although the contribution of the individual chains to
specificity is not always equal. Antibody molecules have evolved to bind to a
large number of molecules by using six randomized loops (CDRs). However,
the size of the antibodies and the complexity of six loops represent a major
design hurdle if the end result is to be a relatively small peptide ligand.

Antibody substructures

Functional substructures of Abs can be prepared by proteolysis and by
recombinant methods. They include the Fab fragment, which contains the VH-CH1
domains of the heavy chain and the VL-CL domains of the light chain,
joined by a single interchain disulfide bond, and the Fv fragment, which
contains only the VH and VL domains. In some cases, a single VH domain retains
significant affinity (Ward et al., 1989). It has also been shown that a certain
monomeric κ light chain will specifically bind to its cognate antigen (L. Masat
et al., 1994). Separated light or heavy chains have sometimes been found to
[...] their size, low solubility or low conformational stability.

Another functional substructure is a single chain Fv (scFv), made of the
variable regions of the immunoglobulin heavy and light chain, covalently
connected by a peptide linker (S-z Hu et al., 1996). These small (Mr 25,000)
proteins generally retain specificity and affinity for antigen in a single
polypeptide and can provide a convenient building block for larger,
antigen-specific molecules. Several groups have reported biodistribution
studies in xenografted athymic mice using scFv reactive against a variety of
tumor antigens, in which specific tumor localization has been observed.
However, the short persistence of scFvs in the circulation limits the exposure
of tumor cells to the scFvs, placing limits on the level of uptake. As a
result, tumor uptake by scFvs in animal studies has generally been only
1-5 %ID/g as opposed to intact [...] high as 60-70 %ID/g.

A small protein scaffold called a "minibody" was designed using a part of
the Ig VH domain as the template (Pessi et al., 1993). Minibodies with high
affinity (dissociation constant (Kd) ~10⁻⁷ M) to interleukin-6 were identified by
randomizing loops corresponding to CDR1 and CDR2 of VH and then selecting
mutants using the phage display method (Martin et al., 1994). These
experiments demonstrated that the essence of the Ab function could be
transferred to a smaller system. However, the minibody had inherited the limited
solubility of the VH domain (Bianchi et al., 1994).

It has been reported that camels (Camelus dromedarius) often lack variable
light chain domains when IgG-like material from their serum is analyzed,
suggesting that sufficient antibody specificity and affinity can be derived from
VH domains (three CDR loops) alone. Davies and Riechmann recently
demonstrated that "camelized" VH domains with high affinity (Kd ~10⁻⁷ M) and
high specificity can be generated by randomizing only the CDR3. To improve
the solubility and suppress nonspecific binding, three mutations were introduced
to the framework region (Davies & Riechmann, 1995). It has not been
definitively shown, however, that camelization can be used, in general, to
improve the solubility and stability of VHs.

An alternative to the "minibody" is the "diabody." Diabodies are small
bivalent and bispecific antibody fragments, i.e., they have two antigen-binding
sites. The fragments contain a heavy-chain variable domain (VH) connected to a
light-chain variable domain (VL) on the same polypeptide chain (VH-VL).
Diabodies are similar in size to an Fab fragment. By using a linker that is too
short to allow pairing between the two domains on the same chain, the domains
are forced to pair with the complementary domains of another chain and create
two antigen-binding sites. These dimeric antibody fragments, or "diabodies," are
bivalent and bispecific (P. Holliger et al., 1993).

Since the development of the monoclonal antibody technology, a large
number of 3D structures of Ab fragments in the complexed and/or free states
have been solved by X-ray crystallography (Webster et al., 1994; Wilson &
Stanfield, 1994). Analysis of Ab structures has revealed that five out of the six
CDRs have limited numbers of peptide backbone conformations, thereby
permitting one to predict the backbone conformation of CDRs using the
so-called canonical structures (Lesk & Tramontano, 1992; Rees et al., 1994). The
analysis also has revealed that the CDR3 of the VH domain (VH-CDR3) usually
has the largest contact surface and that its conformation is too diverse for
canonical structures to be defined; VH-CDR3 is also known to have a large
variation in length (Wu et al., 1993). Therefore, the structures of crucial regions
of the Ab-antigen interface still need to be experimentally determined.

Comparison of crystal structures between the free and complexed states has
revealed several types of conformational rearrangements. They include
side-chain rearrangements, segmental movements, large rearrangements of VH-CDR3
and changes in the relative position of the VH and VL domains (Wilson &
Stanfield, 1993). In the free state, CDRs, in particular those which undergo large
conformational changes upon binding, are expected to be flexible. Since X-ray
crystallography is not suited for characterizing flexible parts of molecules,
structural studies in the solution state have not been possible to provide dynamic
pictures of the conformation of antigen-binding sites.

Mimicking the antibody-binding site

CDR peptides and organic CDR mimetics have been made (Dougall et al.,
1994). CDR peptides are short, typically cyclic, peptides which correspond to
the amino acid sequences of CDR loops of antibodies. CDR loops are
responsible for antibody-antigen interactions. Organic CDR mimetics are
peptides corresponding to CDR loops which are attached to a scaffold, e.g., a
small organic compound.

CDR peptides and organic CDR mimetics have been shown to retain some
binding affinity (Smythe & von Itzstein, 1994). However, as expected, they are
too small and too flexible to maintain full affinity and specificity. Mouse CDRs
have been grafted onto the human Ig framework without the loss of affinity
(Jones et al., 1986; Riechmann et al., 1988), though this "humanization" does
not solve the above-mentioned problems specific to solution studies.

Mimicking natural selection processes of Abs

In the immune system, specific Abs are selected and amplified from a large
library (affinity maturation). The processes can be reproduced in vitro using
combinatorial library technologies. The successful display of Ab fragments on
the surface of bacteriophage has made it possible to generate and screen a vast
number of CDR mutations (McCafferty et al., 1990; Barbas et al., 1991; Winter
et al., 1994). An increasing number of Fabs and Fvs (and their derivatives) is
produced by this technique, providing a rich source for structural studies. The
combinatorial technique can be combined with Ab mimics.

A number of protein domains that could potentially serve as protein
scaffolds have been expressed as fusions with phage capsid proteins. Reviewed in
Clackson & Wells, Trends Biotechnol. 12:173-184 (1994). Indeed, several of
these protein domains have already been used as scaffolds for displaying random
peptide sequences, including bovine pancreatic trypsin inhibitor (Roberts et al.,
PNAS 89:2429-2433 (1992)), human growth hormone (Lowman et al.,
Biochemistry 30:10832-10838 (1991)), Venturini et al., Protein Peptide Letters
1:70-75 (1994)), and the IgG binding domain of Streptococcus (O'Neil et al.,
Techniques in Protein Chemistry V (Crabb, L., ed.) pp. 517-524, Academic
Press, San Diego (1994)). These scaffolds have displayed a single randomized
loop or region.

Researchers have used the small 74 amino acid α-amylase inhibitor
Tendamistat as a presentation scaffold on the filamentous phage M13
(McConnell and Hoess, 1995). Tendamistat is a β-sheet protein from
Streptomyces tendae. It has a number of features that make it an attractive
scaffold for peptides, including its small size, stability, and the availability of
high resolution NMR and X-ray structural data. Tendamistat's overall topology
is similar to that of an immunoglobulin domain, with two β-sheets connected by
a series of loops. In contrast to immunoglobulin domains, the β-sheets of
Tendamistat are held together with two rather than one disulfide bond,
accounting for the considerable stability of the protein. By analogy with the
CDR loops found in immunoglobulins, the loops of Tendamistat may serve a
similar function and can be easily randomized by in vitro mutagenesis.

Tendamistat, however, is derived from Streptomyces tendae. Thus, while
Tendamistat may be antigenic in humans, its small size may reduce or inhibit its
antigenicity. Also, Tendamistat's stability is uncertain. Further, the stability that
is reported for Tendamistat is attributed to the presence of two disulfide bonds.
Disulfide bonds, however, are a significant disadvantage to such molecules in
that they can be broken under reducing conditions and must be properly formed
in order to have a useful protein structure. Further, the loops in
Tendamistat are relatively small, thus limiting the size of the inserts that can be
accommodated in the scaffold. Moreover, it is well known that forming correct
disulfide bonds in newly synthesized peptides is not straightforward. When a
protein is expressed in the cytoplasmic space of E. coli, the most common host
bacterium for protein overexpression, disulfide bonds are usually not formed,
potentially making it difficult to prepare large quantities of engineered
molecules.

Thus, there is an on-going need for small polypeptides that bind with a
target molecule, such as an artificial antibody. These polypeptides can be used
[...] is also an on-going need for polypeptides that bind to more than one target
molecule, and for protein fragments (or binding pairs) that associate or
reconstitute to form a protein.

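The following abbreviations have been used in describing amino acids,
peptides, or proteins: Ala or A, alanine; Arg or R, arginine; Asn or N,
asparagine; Asp or D, aspartic acid; Cys or C, cysteine; Gln or Q, glutamine;
Glu or E, glutamic acid; Gly or G, glycine; His or H, histidine; Ile or I,
isoleucine; Leu or L, leucine; Lys or K, lysine; Met or M, methionine; Phe or F,
phenylalanine; Pro or P, proline; Ser or S, serine; Thr or T, threonine; Trp or W,
tryptophan; Tyr or Y, tyrosine; Val or V, valine.

The following abbreviations have been used in describing nucleic acids,
DNA, or RNA: A, adenosine; T, thymidine; G, guanosine; C, cytosine.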

SUMMARY OF THE INVENTION 

As used herein the indefinite article "a" or "an" carries the meaning of "one
or more."



"Reconstitution" of a protein is where two polypeptide fragments from a
single original protein are bound together, though not necessarily by covalent
bonding. "Association" is where two polypeptide fragments from the same or
different starting proteins are bound together. Again, the binding is not
necessarily by covalent bonding. For a general discussion of protein
reconstitution/reassociation, see Ojennus et al. (2001). Fragments of Fn3 will
reconstitute and/or reassociate at a pH range of between pH 1 and pH 10 at 30
°C. At neutral pH, they will still reassociate at 50 °C.

A "coiled coil" is a widespread structural motif that is found in fibrous
proteins such as myosin and keratin. A coiled coil constitutes two or more
interacting α-helices, supercoiled around one another, that are associated in a
parallel or an anti-parallel orientation. The α-helices of naturally occurring
coiled coils are generally parallel. Sequence features within a natural coiled coil
can lead to preference for an antiparallel helix orientation rather than the more
commonly observed parallel alignment. See Oakley and Kim, Biochemistry
37:12603-12610 (1998) for a detailed discussion of coiled coils. Another
binding pair that could be used to encourage reassociation of two fragments is
the intein system, described by Yamazaki et al. (1998).

The present invention provides an Fn3 monobody binding pair. The
binding pair is made up of two parts, a first Fn3 monobody polypeptide having
two to six β-strand domains (which optionally has a polypeptide tail region
attached to one or both of the terminal β-strand domains) with a loop region
linked between each β-strand domain, and a second Fn3 monobody polypeptide
having two to six β-strand domains (which optionally has a polypeptide tail
region attached to one or both of the terminal β-strand domains) with a loop
region linked between each β-strand domain. A "polypeptide tail region" is a
polypeptide that is one to 25 amino acids in length that is not part of the β-strand
domain. A "terminal β-strand" is one of the two β-strands in the monobody that
is bound to a loop region at only one of its ends. For example, in a monobody
that has three β-strands and two loop regions, one would have a first terminal
β-strand, a loop region, an internal β-strand, a loop region, and then a second
terminal β-strand. Thus, the terminal β-strands are linked to only one loop
region, whereas the internal β-strand is linked at both ends of the β-strand.
The first Fn3 fragment associates with the second Fn3 fragment with a
dissociation constant of less than [...]

[...] polypeptide has two β-strand domains [...] has three β-strand domains, it
contains [...] polypeptide, if the monobody polypeptide has [...] would have
three loop regions in the polypeptide, [...]

The present invention further [...]

[...]determined peptides are generated. It is well [...] sequences can be
cleaved with specific chemical reagents (e.g., cyanogen bromide) or with
proteases (e.g., thrombin, factor X, tobacco etch virus (TEV) protease, human
rhinovirus 3C protease). [...] Creighton [...]

[...] less than 10[...] moles/liter. In one embodiment, an auxiliary region is a
cysteine residue. For example, the first auxiliary region is a first cysteine
and the second [...]
phage display or other methods. Once one found desired monobody
heterodimers (i.e., specific pairs of fragments), they would be reformatted into
uncut, full-length proteins. Thus, disulfide-linked monobodies are instead very
useful vehicles for library construction, even though disulfide linkages are not
present in a final product. Cysteine residues may also be present in the loop
regions and/or the β-strand regions.

In other embodiments, the auxiliary domains are a natural protein/peptide
pair, a peptide-binding protein and its target peptide, or two fragments of a
protein that have been artificially generated. Examples include coiled coils, or a
C-intein and N-intein pair.

The present invention further provides a fibronectin type III (Fn3)
monobody binding pair having two parts: a first fibronectin type III (Fn3)
monobody polypeptide containing two to six β-strand domains with a loop
region linked between each β-strand domain, wherein a polypeptide tail region is
attached to one or both terminal β-strands, and a second Fn3 monobody
polypeptide containing two to six β-strand domains with a loop region linked
between each β-strand domain, wherein a polypeptide tail region is attached to
one or both terminal β-strands, wherein the first Fn3 fragment associates with the
second Fn3 fragment with a dissociation constant of less than 10⁻⁶ moles/liter.

The present invention provides variegated nucleic acid libraries encoding
Fn3 monobody polypeptides, where one or more of the loop regions of the
monobody polypeptides can be modified by insertions, deletions or substitutions.
The present invention also provides polypeptide libraries made from these
nucleic acid libraries.

The present invention provides a fibronectin type III (Fn3) monobody
polypeptide made of two to six β-strand domains with a loop region linked
between each β-strand domain. The monobody polypeptide is capable of
binding to a target molecule with a dissociation constant of less than 10⁻⁶
moles/liter.
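
To give a sense of what a dissociation constant below 10⁻⁶ moles/liter means
in practice, the following minimal sketch computes the equilibrium
concentration of complex for a simple 1:1 association A + B ⇌ AB. It assumes
ideal two-component binding with no competing equilibria; the function name
and the example concentrations are hypothetical and are not taken from this
application.

```python
import math


def complex_concentration(a_total, b_total, kd):
    """Equilibrium [AB] in mol/L for A + B <-> AB.

    Combines Kd = [A][B]/[AB] with mass balance on A and B and returns
    the physically meaningful (smaller) root of the resulting quadratic.
    """
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


# Hypothetical example: both partners at 10 micromolar, Kd = 1 micromolar.
ab = complex_concentration(10e-6, 10e-6, 1e-6)
fraction_bound = ab / 10e-6
```

Under these assumed conditions most, but not all, of each partner is found in
the complex; lowering the total concentrations toward Kd reduces the bound
fraction.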


BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1. β-Strand and loop topology (A, B) and MOLSCRIPT
representation (C, D; Kraulis, 1991) of the VH domain of anti-lysozyme
immunoglobulin D1.3 (A, C; Bhat et al., 1994) and 10th type III domain of
[...] complementarity determining regions (CDRs, hypervariable regions) and the
integrin-binding Arg-Gly-Asp (RGD) sequence are indicated.

Figure 2. Amino acid sequence (SEQ ID NO:[...]) and restriction sites of
the synthetic Fn3 gene. The residue numbering is according to Main et al.
(1992). Restriction enzyme sites designed are shown above the amino acid
[...] has been added for a subsequent cloning into an expression vector. The His
tag (Novagen) fusion protein has an additional sequence,
MGSSHHHHHHSSGLVPRGSH (SEQ ID NO:114), preceding the Fn3
sequence shown above.

Figure 3. A, Far UV CD spectra of wild-type Fn3 at 25°C and 90°C. Fn3
(50 µM) was dissolved in sodium acetate (50 mM, pH 4.6). B, thermal
denaturation of Fn3 monitored at 215 nm. Temperature was increased at a rate
of 1°C/min.

Figure 4. A, Cα trace of the crystal structure of the complex of lysozyme
(HEL) and the Fv fragment of the anti-hen egg-white lysozyme (anti-HEL)
antibody D1.3 (Bhat et al., 1994). [...] CDR3, which make contact with HEL, are
also shown. B, Contact surface area for each residue of the D1.3 VH-HEL and
VH-VL interactions plotted vs. residue number of D1.3 VH. Surface area and
secondary structure were determined with the program DSSP (Kabsch and Sander,
1983). C and D, schematic drawings of the β-sheet structure of the F
strand-loop-G strand moieties of D1.3 VH (C) and Fn3 (D). The boxes denote
residues in β-strands [...] those not in strands. The shaded boxes indicate
residues of which side chains are significantly buried. [...]

Figure 5. Designed Fn3 gene showing DNA (SEQ ID NO:111) and amino
acid (SEQ ID NO:112) sequences. The amino acid numbering is according to
Main et al. (1992). The two loops that were randomized in combinatorial
libraries are enclosed in boxes.

Figure 6. Map of plasmid pAS45. Plasmid pAS45 is the expression vector
of His-tag-Fn3.

Figure 7. Map of plasmid pAS25. Plasmid pAS25 is the expression vector
of Fn3.

Figure 8. Map of plasmid pAS38. pAS38 is a phagemid vector for the
surface display of Fn3.

Figure 9. (Ubiquitin-1) Characterization of ligand-specific binding of
enriched clones using phage enzyme-linked immunosorbent assay (ELISA).
Microtiter plate wells were coated with ubiquitin (1 µg/well; "Ligand (+)") and
then blocked with BSA. Phage solution in TBS containing approximately 10¹⁰
colony forming units (cfu) was added to a well and washed with TBS. Bound
phages were detected with anti-phage antibody-POD conjugate (Pharmacia) with
Turbo-TMB (Pierce) as a substrate. Absorbance was measured using a
Molecular Devices SPECTRAmax 250 microplate spectrophotometer. For a
control, wells without the immobilized ligand were used. 2-1 and 2-2 denote
enriched clones from Library 2 eluted with free ligand and acid, respectively.
4-1 and 4-2 denote enriched clones from Library 4 eluted with free ligand and
acid, respectively.

Figure 10. (Ubiquitin-2) Competition phage ELISA of enriched clones.
Phage solutions containing approximately 10¹⁰ cfu were first incubated with free
ubiquitin at 4°C for 1 hour prior to the binding to a ligand-coated well. The
wells were washed and phages detected as described above.

Figure 11. Competition phage ELISA of ubiquitin-binding monobody 411.
Experimental conditions are the same as described above for ubiquitin. The
ELISA was performed in the presence of free ubiquitin in the binding solution.
The experiments were performed with four different preparations of the same
clone.

Figure 12. (Fluorescein-1) Phage ELISA of four clones, pLB25.1
(containing SEQ ID NO:115), pLB25.4 (containing SEQ ID NO:116), pLB24.1
(containing SEQ ID NO:117) and pLB24.3 (containing SEQ ID NO:118).
Experimental conditions are the same as ubiquitin-1 above.

Figure 13. (Fluorescein-2) Competition ELISA of the four clones (SEQ ID
NOs:115-118). Experimental conditions are the same as ubiquitin-2 above.

Figure 14. ¹H,¹⁵N-HSQC spectrum of a fluorescence-binding monobody
LB25.5. Approximately 20 µM protein was dissolved in 10 mM sodium acetate
buffer (pH 5.0) containing 100 mM sodium chloride. The spectrum was
collected at 30°C on a Varian Unity INOVA 600 NMR spectrometer.

Figure 15. Characterization of the binding reaction of Ubi4-Fn3 to the
target, ubiquitin. (a) Phage ELISA analysis of binding of Ubi4-Fn3 to ubiquitin.
The binding of Ubi4-phages to ubiquitin-coated wells was measured. The
control experiment was performed with wells containing no ubiquitin.

(b) Competition phage ELISA of Ubi4-Fn3. Ubi4-Fn3-phages were
preincubated with soluble ubiquitin at an indicated concentration, followed by
the phage ELISA detection in ubiquitin-coated wells.

(c) Competition phage ELISA testing the specificity of the Ubi4 clone. The
Ubi4 phages were preincubated with 250 µg/ml of soluble proteins, followed by
phage ELISA as in (b).

(d) ELISA using free proteins.

Figure 16. Equilibrium unfolding curves for Ubi4-Fn3 (closed symbols)
and wild-type Fn3 (open symbols). Squares indicate data measured in TBS (Tris
HCl buffer (50 mM, pH 7.5) containing NaCl (150 mM)). Circles indicate data
measured in Gly HCl buffer (20 mM, pH 3.3) containing NaCl (300 mM). The
curves show the best fit of the transition curve based on the two-state model.
Parameters characterizing the transitions are listed in Table 8.

Figure 17. (a) ¹H,¹⁵N-HSQC spectrum of [¹⁵N]-Ubi4-K Fn3.
(b, c) Difference (δ[...] − δ[...]) of ¹H (b) and ¹⁵N (c) chemical shifts plotted
versus residue number. Values for residues 82-84 (shown as filled circles), where
Ubi4-K has deletions, are set to zero. Open circles indicate residues that are
mutated in the Ubi4-K protein. The locations of β-strands are indicated with
arrows.

Figure 18. (A) Guanidine hydrochloride (GuHCl)-induced denaturation of
FNfn10 monitored by Trp fluorescence. The fluorescence emission intensity at
355 nm is shown as a function of GuHCl concentration. The lines show the best
fits of the data to the two-state transition model. (B) Stability of FN3 at 4 M
GuHCl plotted as a function of pH. (C) pH dependence of the m value.
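
As an illustration of the two-state transition model and of the m value
referred to in this caption, the following minimal sketch computes the
fraction of unfolded protein at a given denaturant concentration. It assumes
the commonly used linear dependence of the unfolding free energy on denaturant
concentration, which is an assumption of this sketch rather than a statement
of the fitting procedure actually used; the function names and parameter
values are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(denaturant_molar, dg_water, m_value, temp_kelvin=298.15):
    """Two-state N <-> U model with a linear free-energy dependence.

    dG([D]) = dg_water - m_value * [D]   (kcal/mol)
    K_unfold = exp(-dG / RT); fraction unfolded = K / (1 + K)
    """
    dg = dg_water - m_value * denaturant_molar
    k_unfold = math.exp(-dg / (R_KCAL * temp_kelvin))
    return k_unfold / (1.0 + k_unfold)


def midpoint(dg_water, m_value):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dg_water / m_value


# Hypothetical parameters: dG in water 6 kcal/mol, m = 2 kcal/(mol*M).
cm = midpoint(6.0, 2.0)
half = fraction_unfolded(cm, 6.0, 2.0)
```

In this parameterization a larger m value gives a steeper transition, and the
stability at any fixed denaturant concentration follows directly from the two
fitted parameters.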

Figure 19. A two-dimensional H(C)CO spectrum of FNfn10 showing the
¹³C chemical shift of the carboxyl carbon (vertical axis) and the ¹H shift of ¹Hβ of
Asp or ¹Hγ of Glu, respectively (horizontal axis). Cross peaks are labeled with
their respective residue numbers.

Figure 20. pH-Dependent shifts of the ¹³C chemical shifts of the carboxyl
carbons of Asp and Glu residues in FNfn10. Panel A shows data for Asp 3, 67
and 80, and Glu 38 and 47. The lines are the best fits of the data to the
Henderson-Hasselbalch equation with one ionizable group (McIntosh, L. P.,
Hand, G., Johnson, P. E., Joshi, M. D., Koerner, M., Plesniak, L. A., Ziser, L.,
Wakarchuk, W. W. & Withers, S. G. (1996) Biochemistry 35, 9958-9966).
Panel B shows data for Asp 7 and 23 and Glu 9. The continuous lines show the
best fits to the Henderson-Hasselbalch equation with two ionizable groups, while
the dashed lines show the best fits to the equation with a single ionizable group.
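
The two fitting functions named in this caption can be sketched as follows.
This is a minimal illustration assuming fast exchange between protonation
states, so that the observed shift is the population-weighted average; the
two-group form shown is a sequential titration model chosen for this sketch,
and the exact expression used in the cited reference may differ. Function
and parameter names are hypothetical.

```python
def shift_one_group(ph, pka, delta_acid, delta_base):
    """Observed chemical shift for a single ionizable group."""
    ratio = 10.0 ** (ph - pka)  # [A-]/[HA] from Henderson-Hasselbalch
    return (delta_acid + delta_base * ratio) / (1.0 + ratio)


def shift_two_groups(ph, pka1, pka2, delta_low, delta_mid, delta_high):
    """Observed shift for two sequential titration steps (pka1 < pka2)."""
    r1 = 10.0 ** (ph - pka1)                # singly deprotonated population
    r2 = 10.0 ** (2.0 * ph - pka1 - pka2)   # doubly deprotonated population
    return (delta_low + delta_mid * r1 + delta_high * r2) / (1.0 + r1 + r2)


# Hypothetical values: at pH equal to pKa the shift is the mean of both limits.
mid_shift = shift_one_group(4.0, 4.0, 176.0, 180.0)
```

A residue whose titration curve is fit noticeably better by the two-group
function than by the one-group function is a candidate for coupling to a
second ionizable group nearby.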

Figure 21. (A) The amino acid sequence of FNfn10 (SEQ ID NO:121)
shown according to its topology (Main, A. L., Harvey, T. S., Baron, M., Boyd, J.,
& Campbell, I. D. (1992) Cell 71, 671-678). Asp and Glu residues are
highlighted with gray circles. The thin lines and arrows connecting circles
indicate backbone hydrogen bonds. (B) A CPK model of FN3 showing the
locations of Asp 7 and 23 and Glu 9.

Figure 22. Thermal denaturation of the wild-type and mutant FNfn10
proteins at pH 7.0 and 2.4 in the presence of 6.3 M urea and 0.1 or 1.0 M NaCl.
Change in circular dichroism signal at 227 nm is plotted as a function of
temperature. The filled circles show the data in the presence of 1 M NaCl and
the open circles are data in the presence of 0.1 M NaCl. The left column shows
data taken at pH 2.4 and the right column at pH 7.0. The identity of proteins is
indicated in the panels.

Figure 23. GuHCl-induced denaturation of FNfn10 mutants monitored with
fluorescence. Fluorescence data were converted to the fraction of unfolded
protein according to the two-state transition model (Loladze, V. V.,
Ibarra-Molero, B., Sanchez-Ruiz, J. M. & Makhatadze, G. I. (1999) Biochemistry 38,
16419-16423), and plotted as a function of GuHCl concentration.

Figure 24. pH titration of the carboxyl ¹³C resonance of Asp and Glu
residues in D7N (open circles) and D7K (closed circles) FNfn10. Data for the
wild-type (crosses) are also shown for comparison. Residue names are denoted
in the individual panels.

Figure 25. Topographic illustration of the sites for the introduction of the
cleavage site insertion (GGMGG; SEQ ID NO:122) in the CD and EF loops,
respectively.

Figures 26A-B. Guanidine hydrochloride induced unfolding of mutant
proteins with an engineered cleavage site. Circles represent protein without
insertion, squares an insertion in the CD loop and triangles an insertion in the EF
loop. 26A: Comparison of insertion sites. 26B: Comparison of the effect on
wild-type protein to [...] proteins. Fitting parameters for the curves
are listed in table W-1.

Figure 27. Cleavage of a peptide bond after methionine by cyanogen
bromide.

Figure 28. Chromatogram of the reverse phase separation of CD-loop
cleaved fragments of CD92. The actual fragments are marked; additional
fractions: partial [...] leader sequence [...]

Figures 29A-C. 29A shows gel filtration chromatograms of a mixture of N-
and C-terminal fragments at 3 µM concentration, 29B shows elution of
C-terminal alone, and 29C shows elution of N-terminal alone.

Figure 30. ¹H-¹⁵N-HSQC spectrum of uncleaved CD92 protein.

Figures 31A-B. ¹H-¹⁵N-HSQC spectra of the isolated C-terminal fragment
at 5°C (31A) and at 30°C (31B). At lower temperature, the fragment appears
mostly unfolded, while [...] additional, more dispersed peaks.

Figure 32. ¹H-¹⁵N-HSQC spectra of the isolated N-terminal fragment at
30°C. The sample is likely in an oligomeric conformation, as indicated by the
apparent line broadening.
Figures 33A-B. ¹H-¹⁵N-HSQC spectra of the N-terminal (33A) and the
C-terminal fragment (33B) in partially labeled complex at 30°C.

Figures 34A-B. Direct comparison of the parental protein and the complex
formed by the fragments. 34A shows uncut CD92 at 30°C, and 34B shows an
overlay of both fragments in complex at 30°C.

Figure 35. The ratio of ¹⁵N-NOE signal for the N-terminally labeled
complex over that of the reference spectrum revealed that the formed complex is
as stable as a fully folded protein. Only 5 residues show an increased motion on
the investigated timescale, most likely on either terminus of the fragment. The
very N-terminus of FNfn10 is known to be disordered, and the six C-terminal
residues of this fragment include 4 glycines (sequence GGNGGhS; SEQ ID
NO:124), where (hS) stands for the homoserine lactone that resulted in the
cleavage. Error was estimated from the noise in the spectra to be ±0.29.

Figure 36. The ratio of ¹⁵N-NOE signal for the C-terminally labeled
complex over that of the reference spectrum revealed that the formed complex is
as stable as a fully folded protein. Error was estimated from the noise in the
spectra to be ±0.36.

Figure 37. Time course of the fluorescence intensity due to nonspecific
adherence to the cuvette.

Figures 38A-F. Representative series of the reconstitution of CD92
fragments monitored by fluorescence at 1 M (38A, 38B), 1.5 M (38C, 38D) and
2 M (38E, 38F) urea. For each urea concentration, two separate experiments are
shown, each displaying fluorescence at the maximum of the fluorescence at
350 nm (hollow circles) and that averaged over the data of 350 nm to 360 nm
(filled diamonds), along with their respective fitted analytical curves (lines). For
the calculation, values from the fitting of the averaged curves were used.

Figure 39. Dependence of the measured dissociation constant on urea
concentration.

Figure 40. Dependence of the measured dissociation constant on glycerol.

Figure 41. Far UV CD spectra for the two fragments. Filled circles
represent the N-terminal, hollow ones the C-terminal.
Figure 42. Dependence of the β-turn inflection seen in the CD spectrum of
the C-terminal fragment. C-terminal fragment concentrations are 100 µM
(circles), 50 µM (squares), 10 µM (crosses) and 1.5 µM (triangles), where the
lowest concentration curve was measured in buffer equal to the fluorescence
experiments above. All others were measured in 20 mM sodium phosphate
buffer at pH 6. Temperature and cooperativity of unfolding change with
concentration.

Figure 43. Scheme for library construction using fragment reconstitution.

Figure 44. In vivo reconstitution of monobodies. Yeast strain EGY48 with
[...] DNA binding domain, (FNABC)-NLS-B42 fusion protein, was mated with strain
RFY206 with a plasmid that encodes for a LexA-C terminal half of FNfn10
fusion featuring either wild-type (FNDEFG) or a monobody FG loop.
FNEDFG0319 has the FG loop of monobody pYT0319, and FNEDFG4699 that
of monobody pYT4699, which have been selected for two different target
proteins. As a control, EGY48 with pTarget plasmid (Origene) and RFY206
with pBait plasmid (Origene) were used. After the mated cells were replicated
on YC Gal Raf -his -ura -trp media supplemented with [...] and incubated
overnight, the β-galactosidase assay was performed using the agarose overlay
method.

Figure 45. Topographic illustration of the Fn3 molecule (SEQ ID NO:123).

Figure 46. Schematic drawings of vectors for yeast surface display of FN3
and FN3 fragments. pYDFN1 is for surface display of full-length FN3. It also
contains the X-press epitope tag, V5 epitope tag and His6 tag for detection of
the displayed protein. pGalAgaFN(C)V5 is for surface display of an FN3 fragment
(residues 43-94) that is fused to V5 and His6 tags. pGalsecFn(K)FLAG is for
secretion of an FN3 fragment (residues 1-42) that is fused to the FLAG tag.

Figure 47. FACS analysis of surface expression of FN3 fragments. The
horizontal axis reports the amount
of the FN3 C-terminal fragment that is fused to the Aga2 protein and anchored
on the cell surface. The vertical axis indicates the fluorescence intensity of PE,
which indicates the amount of the FN3 N-terminal fragment that is secreted as a
soluble protein. Each dot represents one yeast cell.

(A) Yeast cells expressing the wild-type C-terminal fragment only. (B)
Yeast cells expressing both the wild-type N- and C-terminal fragments. (C)
Yeast cells expressing only the C-terminal fragment of the streptavidin binding
monobody, STAV1. (D) Yeast cells expressing the wild-type N-terminal
fragment and the C-terminal fragment of STAV1. (E) Yeast cells expressing
only the wild-type N-terminal fragment.

DETAILED DESCRIPTION OF THE INVENTION

For the past decade the immune system has been exploited as a rich source
of de novo catalysts. Catalytic antibodies have been shown to have
chemoselectivity, enantioselectivity, large rate accelerations, and even an ability
to reroute chemical reactions. In most cases the antibodies have been elicited to
transition state analog (TSA) haptens. These TSA haptens are stable,
low-molecular weight compounds designed to mimic the structures of the
energetically unstable transition state species that briefly (approximate half-life
10^-13 s) appear along reaction pathways between reactants and products.
Anti-TSA antibodies, like natural enzymes, are thought to selectively bind and
stabilize the transition state, thereby easing the passage of reactants to products.
Thus, upon binding, the antibody lowers the energy of the actual transition state
and increases the rate of the reaction. These catalysts can be programmed to
bind to geometrical and electrostatic features of the transition state so that the
reaction route can be controlled by neutralizing unfavorable charges, overcoming
entropic barriers, and dictating stereoelectronic features of the reaction. By this
means even reactions that are otherwise highly disfavored have been catalyzed
(Janda et al., 1997). Further, in many instances catalysts have been made for
reactions for which there are no known natural or man-made enzymes.

The success of any combinatorial chemical system in obtaining a particular
function depends on the size of the library and the ability to access its members.
Most often the antibodies that are made in an animal against a hapten that
mimics the transition state of a reaction are first screened for binding to the
hapten and then screened again for catalytic activity. An improved

therebv linking chemistry and replication. 
5 pJvJIbyaddingrandon^antibodysene.ton.egenethatencodesa.e 

^ent.andantiWy^en.thatbindstoatargetcanbe.dennfied.y 

10 amplifying the associated DTSt A. 

^L^n-antig^— -havea.Httec^ 

rldy.oilte.ctwiu.natives^. active— on 4c concept 
^opposite, o-— — — 

chelaireactionensues. ^ this sanre ch^ica. reaction becomes pa* * 
nanisn.of^c-^cevent.n.ace^sen.eoneUi^^gw.tha 

^l,™*. — Keacnve^unogenscan* 
2 „ .ee.ceptthat.neyare^^einversewayin^-.eadof—ga 

mechanism, they induce amechanism. 

M^made catalyuc anybodies have considerate conunercal pofcntta, m 

JLft* in prototype experiments in therapeutic applied, such as 
such as biosensors and organic synthesis. 
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with an appropriate hapten. Catalytic antibodies also could be used as clinical 
diagnostic tools or as regioselective or stereoselective catalysts in the synthesis 
of fine chemicals. 

I. Mutation of Fn3 loops and grafting of Ab loops onto Fn3

An ideal scaffold for CDR grafting is highly soluble and stable. It is small 
enough for structural analysis, yet large enough to accommodate multiple CDRs 
so as to achieve tight binding and/or high specificity. 

A novel strategy to generate an artificial Ab system on the framework of an
existing non-Ab protein was developed. An advantage of this approach over the
minimization of an Ab scaffold is that one can avoid inheriting the undesired
properties of Abs. Fibronectin type III domain (Fn3) was used as the scaffold.
Fibronectin is a large protein which plays essential roles in the formation of
extracellular matrix and cell-cell interactions; it consists of many repeats of three
types (I, II and III) of small domains (Baron et al., 1991). Fn3 itself is the
paradigm of a large subfamily (Fn3 family or s-type Ig family) of the
immunoglobulin superfamily (IgSF). The Fn3 family includes cell adhesion
molecules, cell surface hormone and cytokine receptors, chaperonins, and
carbohydrate-binding domains (for reviews, see Bork & Doolittle, 1992; Jones,
1993; Bork et al., 1994; Campbell & Spitzfaden, 1994; Harpaz & Chothia,
1994).

Recently, crystallographic studies revealed that the structure of the DNA
binding domains of the transcription factor NF-κB is also closely related to the
Fn3 fold (Ghosh et al., 1995; Müller et al., 1995). These proteins are all
involved in specific molecular recognition, and in most cases ligand-binding
sites are formed by surface loops, suggesting that the Fn3 scaffold is an excellent
framework for building specific binding proteins. The 3D structure of Fn3 has
been determined by NMR (Main et al., 1992) and by X-ray crystallography
(Leahy et al., 1992; Dickinson et al., 1994). The structure is best described as a
β-sandwich similar to that of the Ab VH domain except that Fn3 has seven β-strands
instead of nine (Fig. 1). There are three loops on each end of Fn3; the positions
of the BC, DE and FG loops approximately correspond to those of CDR1, 2 and
3 of the VH domain, respectively (Fig. 1 C, D).

rissnraU^Sresid^.—c.so^eands.ab.e.nrsoneo 

7 ™ oresent iust in human fibronectin, providing important information 
domains are present just rn 

onconservedresidues which are oftennnpor^tfor me staM ty 
se fl uencealignmen^Main«a(.,1992andD 1 ctan i «>n a , I /.,1994). Fro 
sequence*"^ loops, suggesting 

10 — ^ 7^~r:ltL V LmatmeF G 

^.tructureofhuraangrowmhonnone-receptorcomplexCdeVos^^, 
15 SLesecondFnSdomamofmereceptormte^wimhormone^eFG 

„globuhn domains, with seven p stands forming two arrtrparane! P 
immunogiouui , The structure of the 

2Q sh ee te ,wWchpackagainsteachother(Mam,«a;.,1992).-rh 

^nmodmecon.istsofsevenpstrands.whichformas— oftwo 

Irfe, p sheets, one containing three strands (ABE) and the other four 
annparanep ,„, ,o 8 81 The triple-stranded p sheet consists of 

strands (CCFG) (Williams eral, 1988). The trip 

25 majority of the conserved residues contribute to the hydrophobic core, 
25 majontyorm Tm w md Trv-68 lying toward the N-terminal 

invariant hydrophobic residues Trp-22 and Try o» ying 
^Cternunaundsofmecorcrespectivdy. The p strands are much less 

30 

Gene construction and mutagenesis

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A synthetic gene for tenth Fn3 of human fibronectin (Fig. 2) was designed 
which includes convenient restriction sites for ease of mutagenesis and uses 
specific codons for high-level protein expression (Gribskov et al, 1984). 

The gene was assembled as follows: (1) the gene sequence was divided into
five parts with boundaries at designed restriction sites (Fig. 2); (2) for each part, a
pair of oligonucleotides that code opposite strands and have complementary
overlaps of ~15 bases was synthesized; (3) the two oligonucleotides were
annealed and single strand regions were filled in using the Klenow fragment of
DNA polymerase; (4) the double-stranded oligonucleotide was cloned into the
pET3a vector (Novagen) using restriction enzyme sites at the termini of the
fragment and its sequence was confirmed by an Applied Biosystems DNA
sequencer using the dideoxy termination protocol provided by the manufacturer;
(5) steps 2-4 were repeated to obtain the whole gene (plasmid pAS25) (Fig. 7).
Although the present method takes more time to assemble a gene than the
one-step polymerase chain reaction (PCR) method (Sandhu et al., 1992), no
mutations occurred in the gene. Mutations would likely have been introduced by
the low fidelity replication by Taq polymerase and would have required
time-consuming gene editing. The gene was also cloned into the pET15b (Novagen)
vector (pEW1). Both vectors expressed the Fn3 gene under the control of the
bacteriophage T7 promoter (Studier et al., 1990); pAS25 expressed the
96-residue Fn3 protein only, while pEW1 expressed Fn3 as a fusion protein with a
poly-histidine peptide (His-tag). Recombinant DNA manipulations were
performed according to Molecular Cloning (Sambrook et al., 1989), unless
otherwise stated.
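
Steps (2) and (3) of the assembly procedure can be illustrated with a short sketch. It models annealing of two oligonucleotides through a complementary overlap and the fill-in of the single-stranded regions. The function names and the 50-base segment are hypothetical and chosen for illustration only; they are not the oligonucleotides of Table 2.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def anneal_and_fill_in(top_oligo: str, bottom_oligo: str, min_overlap: int = 15) -> str:
    """Top-strand sequence of the duplex formed when two oligonucleotides anneal
    through their complementary 3' ends and the gaps are filled in."""
    bottom_as_top = reverse_complement(bottom_oligo)
    longest = min(len(top_oligo), len(bottom_as_top))
    # Search from the longest candidate overlap down to the minimum allowed.
    for n in range(longest, min_overlap - 1, -1):
        if top_oligo[-n:] == bottom_as_top[:n]:
            return top_oligo + bottom_as_top[n:]
    raise ValueError(f"no complementary overlap of at least {min_overlap} bases")


# Hypothetical 50-base segment split into two oligonucleotides
# that code opposite strands and overlap by 15 bases.
segment = "ATGCAGGTTTCTGATGTTCCGCGTGACCTGGAAGTTGTTGCTGCGACCCC"
top_oligo = segment[:32]
bottom_oligo = reverse_complement(segment[17:])

assembled = anneal_and_fill_in(top_oligo, bottom_oligo)
print(assembled == segment)
```

The sketch only checks sequence logic; it says nothing about annealing temperature or polymerase fidelity.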

Mutations were introduced to the Fn3 gene using either cassette
mutagenesis or oligonucleotide site-directed mutagenesis techniques (Deng &
Nickoloff, 1992). Cassette mutagenesis was performed using the same protocol
for gene construction described above; a double-stranded DNA fragment coding a
new sequence was cloned into an expression vector (pAS25 and/or pEW1).
Many mutations can be made by combining a newly synthesized strand (coding
mutations) and an oligonucleotide used for the gene synthesis. The resulting
genes were sequenced to confirm that the designed mutations and no other
mutations were introduced by mutagenesis reactions.

..eso.cesofloops.obesranedon.Fn3. Anti-hen egg, 
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Table 1. Amino acid sequences of D1.3 VH CDR3, VH8 CDR3 and Fn3 FG 
loop and list of planned mutants. 





          96 100  105
           •   •    •
D1.3    ARERDYRLDYWGQG       (SEQ ID NO:1)
VH8     ARGAVVSYYAMDYWGQG    (SEQ ID NO:2)

         75   80   85
          •    •    •
Fn3     YAVTGRGDSPASSKPI     (SEQ ID NO:3)

Mutant  Sequence
D1.3-1  YAERDYRLDY----PI     (SEQ ID NO:4)
D1.3-2  YAVRDYRLDY----PI     (SEQ ID NO:5)
D1.3-3  YAVRDYRLDYASSKPI     (SEQ ID NO:6)
D1.3-4  YAVRDYRLDY---KPI     (SEQ ID NO:7)
D1.3-5  YAVRDYR-----SKPI     (SEQ ID NO:8)
D1.3-6  YAVTRDYRL--SSKPI     (SEQ ID NO:9)
D1.3-7  YAVTERDYRL-SSKPI     (SEQ ID NO:10)
VH8-1   YAVAVVSYYAMDY-PI     (SEQ ID NO:11)
VH8-2   YAVTAVVSYYASSKPI     (SEQ ID NO:12)

Underlines indicate residues in β-strands. Bold
characters indicate replaced residues.


In addition, an anti-HEL single VH domain termed VH8 (Ward et al., 1989)
was chosen as a template. VH8 was selected by library screening and, in spite of
the lack of the VL domain, VH8 has an affinity for HEL of 27 nM, probably due
to its longer VH-CDR3 (Table 1). Therefore, its VH-CDR3 was grafted onto
Fn3. Longer loops may be advantageous on the Fn3 framework because they
may provide higher affinity and also are close to the loop length of wild-type
Fn3. The 3D structure of VH8 was not known and thus the VH8 CDR3
sequence was aligned with that of D1.3 VH-CDR3; two loops were designed
(Table 1).



Mutant construction and production

Site-directed mutagenesis experiments were performed to obtain designed
sequences. Two mutant Fn3s, D1.3-1 and D1.3-4 (Table 1) were obtained and
both were expressed as soluble His-tag fusion proteins. D1.3-4 was purified and
the His-tag portion was removed by thrombin cleavage. D1.3-4 is soluble up to
at least 1 mM at pH 7.2. No aggregation of the protein has been observed during
sample preparation and NMR data acquisition.

Protein expression and purification

E. coli BL21 (DE3) (Novagen) were transformed with an expression vector
(pAS25, pEW1 and their derivatives) containing a gene for the wild-type or a
mutant. Cells were grown in M9 minimal medium and M9 medium
supplemented with Bactotryptone (Difco) containing ampicillin (200 µg/ml). For
isotopic labeling, 15N NH4Cl and/or 13C glucose replaced unlabeled components.
500 ml medium in a 2 liter baffle flask was inoculated with 10 ml of overnight
culture and agitated at 37°C. Isopropylthio-β-galactoside (IPTG) was added at a
final concentration of 1 mM to initiate protein expression when OD (600 nm)
reached one. The cells were harvested by centrifugation 3 hours after the
addition of IPTG and kept frozen at -70°C until used.

Fn3 without His-tag was purified as follows. Cells were suspended in
5 ml/(g cell) of Tris (50 mM, pH 7.6) containing ethylenediaminetetraacetic acid
(EDTA; 1 mM) and phenylmethylsulfonyl fluoride (1 mM). HEL was added to a
final concentration of 0.5 mg/ml. After incubating the solution for 30 minutes at
37°C, it was sonicated three times for 30 seconds on ice. Cell debris was
removed by centrifugation. Ammonium sulfate was added to the solution and
the precipitate recovered by centrifugation. The pellet was dissolved in 5-10 ml
sodium acetate (50 mM, pH 4.6) and insoluble material was removed by




WO 03/104418 



PCT/US03/18030 



centrifugation. The solution was applied to a Sephacryl S100HR column
(Pharmacia) equilibrated in the sodium acetate buffer. Fractions containing Fn3
were then applied to a Resource S column (Pharmacia) equilibrated in sodium
acetate (50 mM, pH 4.6) and eluted with a linear gradient of sodium chloride
(0-0.5 M). The protocol can be adjusted to purify mutant proteins with different
surface charge properties.

Fn3 with His-tag was purified as follows. The soluble fraction was
prepared as described above, except that sodium phosphate buffer (50 mM, pH
7.6) containing sodium chloride (100 mM) replaced the Tris buffer. The solution
was applied to a Hi-Trap chelating column (Pharmacia) preloaded with nickel
and equilibrated in the phosphate buffer. After washing the column with the
buffer, His-tag-Fn3 was eluted in the phosphate buffer containing 50 mM
EDTA. Fractions containing His-tag-Fn3 were pooled and applied to a
Sephacryl S100-HR column, yielding highly pure protein. The His-tag portion
was cleaved off by treating the fusion protein with thrombin using the protocol
supplied by Novagen. Fn3 was separated from the His-tag peptide and thrombin
by a Resource S column using the protocol above.

The wild-type and two mutant proteins so far examined are expressed as
soluble proteins. In the case that a mutant is expressed as inclusion bodies
(insoluble aggregate), it is first examined whether it can be expressed as a soluble
protein at lower temperature (e.g., 25-30°C). If this is not possible, the inclusion
bodies are collected by low-speed centrifugation following cell lysis as described
above. The pellet is washed with buffer, sonicated and centrifuged. The
inclusion bodies are solubilized in phosphate buffer (50 mM, pH 7.6) containing
guanidinium chloride (GdnCl, 6 M) and loaded on a Hi-Trap chelating
column. The protein is eluted with the buffer containing GdnCl and 50 mM
EDTA.

Conformation of mutant Fn3, D1.3-4

The 1H NMR spectra of His-tag D1.3-4 fusion protein closely resembled
that of the wild-type, suggesting the mutant is folded in a similar conformation to
that of the wild-type. The spectrum of D1.3-4 after the removal of the His-tag
was characteristic of a β-sheet protein (Wüthrich, 1986).

THe 2D NOESY spectrum of Dl .3-4 provided further evidence for a 
, * Th^resion in the spectrum showed interactions 
, ™«!erved conformation. The region muipop 

and 0 37 pom- (Baron « <./., 1992)). — =es corresponding to the two 
and 0.37 ppm, IB ^ Q ^ ^ 

methylprotoiisarepresentintheDl^specirunn 

..^Ather conserved cross peaks 

10 — — — -SI. mgUyltely thoseof 

indicate that the two resonances in the ui-wp* 

»«■ ^nordifferences^eenme^ospectraarepr^^eto 
ilcturaiperturhationduetomemu.uons. 

Lonivfourresiduesawav^memu^residuesofu.eFG.c.p^.e.). 

actions in me loop (more than 10% of total residues; F,g. 12, Table 2 013-4 
I Therefore the result provide strong support mat Uie FG loop , 

20 ^^^^^^ 

thusthattheFGloop can be mutated extensively. 

Table 2. Sequences of oligonucleotides

Name        Sequence
FN1F        CGGGATCCCATATGCAGGTTTCTGATGTTCCGCGTGACCTGGAAGTTGTTGCTGCGACC (SEQ ID NO:13)
FN1R        TAACTGCAGGAGCATCCCAGCTGATCAGCAGGCTAGTCGGGGTCGCAGCAACAAC (SEQ ID NO:14)
FN2F        CTCCTGCAGTTACCGTGCGTTATTACCGTATCACGTACGGTGAAACCGGTG (SEQ ID NO:15)
FN2R        GTGAATTCCTGAACCGGGGAGTTACCACCGGTTTCACCG (SEQ ID NO:16)
FN3F        AGGAATTCACTGTACCTGGTTCCAAGTCTACTGCTACCATCAGCGG (SEQ ID NO:17)
FN3R        GTATAGTCGACACCCGGTTTCAGGCCGCTGATGGTAGC (SEQ ID NO:18)
FN4F        CGGGTGTCGACTATACCATCACTGTATACGCT (SEQ ID NO:19)
FN4R        CGGGATCCGAGCTCGCTGGGCTGTCACCACGGCCAGTAACAGCGTATACAGTGAT (SEQ ID NO:20)
FN5F        CAGCGAGCTCCAAGCCAATCTCGATTAACTACCGT (SEQ ID NO:21)
FN5R        CGGGATCCTCGAGTTACTAGGTACGGTAGTTAATCGA (SEQ ID NO:22)
FN5R'       CGGGATCCACGCGTGCCACCGGTACGGTAGTTAATCGA (SEQ ID NO:23)
gene3F      CGGGATCCACGCGTCCATTCGTTTGTGAATATCAAGGCCAATCG (SEQ ID NO:24)
gene3R      CCGGAAGCTTTAAGACTCCTTATTACGCAGTATGTTAGC (SEQ ID NO:25)
38TAABglII  CTGTTACTGGCCGTGAGATCTAACCAGCGAGCTCCA (SEQ ID NO:26)
BC3         GATCAGCTGGGATGCTCCTNNKNNKNNKNNKNNKTATTACCGTATCACGTA (SEQ ID NO:27)
FG2         TGTATACGCTGTTACTGGCMNKNNKNNKNNK^NKNNKTCCAAGCCAATCTCGAT (SEQ ID NO:28)
FG3         CTGTATACGCTGTTACTGGCNNKNNKNNKN^GCGAGCTCCAAG (SEQ ID NO:29)
FG4         CATCACTGTATACGCTGTTACTNNKNNKNNKNNKNNKTCCAAGCCAATCTC (SEQ ID NO:30)

Restriction enzyme sites are underlined. N and K denote an equimolar mixture
of A, T, G and C and that of G and T, respectively.
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^ ( Beo M& Sc h e lto a n , 1 9 8 7 ; PaoeeMM989). Nonl.ne,^- 
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20 MaCtat ° Sh ^ of wo seiected mutant Fn3s were studied; the 

The structure and stability ot two 
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Stability was also determined by guanidinium chloride (GdnCl)- and urea- 
induced unfolding reactions. Preliminary unfolding curves were recorded using
a fluorometer equipped with a motor-driven syringe; GdnCl or urea was added
continuously to the protein solution in the cuvette. Based on the preliminary
unfolding curves, separate samples containing varying concentrations of a
denaturant were prepared and fluorescence (excitation at 290 nm, emission at
300-400 nm) or CD (ellipticity at 222 and 215 nm) was measured after the
samples were equilibrated at the measurement temperature for at least one hour.
The curve was fitted by the least-squares method to the equation for the two-state
model (Santoro & Bolen, 1988; Koide et al., 1993). The change in protein
concentration was compensated if required.
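
The two-state model used for such fits can be sketched as follows. The sketch assumes the common linear extrapolation form, in which the free energy of unfolding decreases linearly with denaturant concentration and the native and unfolded baselines are linear. The parameter values, the temperature of 298.15 K and all function names are illustrative assumptions, not values from the experiments described here.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(denaturant_m, dg_water, m_value, temp_k=298.15):
    """Fraction of unfolded protein in a two-state model with
    dG(D) = dG_water - m * [D] (linear extrapolation, assumed here)."""
    dg = dg_water - m_value * denaturant_m
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


def observed_signal(denaturant_m, dg_water, m_value,
                    native=(1.0, 0.0), unfolded=(0.0, 0.0), temp_k=298.15):
    """Signal as the population-weighted sum of two linear baselines.
    Each baseline is given as (intercept, slope versus denaturant)."""
    f_u = fraction_unfolded(denaturant_m, dg_water, m_value, temp_k)
    y_native = native[0] + native[1] * denaturant_m
    y_unfolded = unfolded[0] + unfolded[1] * denaturant_m
    return (1.0 - f_u) * y_native + f_u * y_unfolded


# Hypothetical parameters: dG_water = 6 kcal/mol, m = 2 kcal/mol/M.
DG_WATER, M_VALUE = 6.0, 2.0
midpoint = DG_WATER / M_VALUE  # denaturant concentration where dG = 0
for conc in (0.0, 2.0, midpoint, 4.0, 6.0):
    print(conc, round(fraction_unfolded(conc, DG_WATER, M_VALUE), 4))
```

In a real fit, the six parameters (two baselines, the free energy in water and the m value) would be adjusted by a least-squares routine against the measured curve.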

Once the reversibility of the thermal unfolding reaction is established, the
unfolding reaction is measured by a Microcal MC-2 differential scanning
calorimeter (DSC). The cell (~1.3 ml) is filled with FnAb solution (0.1 -
1 mM) and ΔCp (= ΔH/ΔT) is recorded as the temperature is slowly raised.
Tm (the midpoint of unfolding), ΔH of unfolding and ΔG of unfolding are
determined by fitting the transition curve (Privalov & Potekhin, 1986) with the
Origin software provided by Microcal.

Thermal unfolding

A temperature-induced unfolding experiment on Fn3 was performed using
circular dichroism (CD) spectroscopy to monitor changes in secondary structure.
The CD spectrum of the native Fn3 shows a weak signal near 222 nm (Fig. 3A),
consistent with the predominantly β-structure of Fn3 (Perczel et al., 1992). A
cooperative unfolding transition is observed at 80-90°C, clearly indicating high
stability of Fn3 (Fig. 3B). The free energy of unfolding could not be determined
due to the lack of a post-transition baseline. The result is consistent with the
high stability of the first Fn3 domain of human fibronectin (Litvinovich et al.,
1992), thus indicating that Fn3 domains are in general highly stable.


Binding assays

The binding reactions of monobodies were characterized quantitatively
using an isothermal titration calorimeter (ITC) and fluorescence spectroscopy.

The enthalpy change (ΔH) of binding was measured using a Microcal
Omega ITC (Wiseman et al., 1989). The sample cell (~1.3 ml) was filled with
monobody solution (≤ 100 µM, changed according to Kd), and the reference cell
was filled with distilled water. The system was equilibrated at a given temperature
until a stable baseline was obtained; 5-20 µl of ligand solution (≤ 2 mM) was
injected by a motor-driven syringe within a short duration. From the
observed heat change as a function of ligand concentration, ΔH and Kd were
determined (Wiseman et al., 1989). ΔG and ΔS of the binding reaction were
deduced from the two directly measured parameters. Experiments were also
performed by placing a ligand in the cell and titrating
with an FnAb. It should be emphasized that only ITC gives a direct measurement
of ΔH, thereby making it possible to evaluate enthalpic and entropic
contributions to the binding energy. ITC was successfully used to monitor the
binding reaction of the D1.3 Ab (Tello et al., 1993; Bhat et al., 1994).
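
The deduction of ΔG and ΔS from the two parameters measured by ITC follows from standard thermodynamic relations, sketched below. The sketch assumes that the binding constant is expressed as a dissociation constant in molar units with a 1 M standard state; the numerical inputs and the function name are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def binding_thermodynamics(kd_molar, delta_h, temp_k=298.15):
    """Return (dG, dS) of binding from a dissociation constant and dH.

    dG = RT ln(Kd)      (equivalently -RT ln(Ka), 1 M standard state)
    dS = (dH - dG) / T
    Energies are in kcal/mol, entropy in kcal/mol/K.
    """
    delta_g = R_KCAL * temp_k * math.log(kd_molar)
    delta_s = (delta_h - delta_g) / temp_k
    return delta_g, delta_s


# Hypothetical measurement: Kd = 1 µM, dH = -10 kcal/mol at 25 °C.
delta_g, delta_s = binding_thermodynamics(1e-6, -10.0)
print(round(delta_g, 2), round(delta_s * 1000, 2))  # kcal/mol, cal/mol/K
```

A negative ΔS together with a more negative ΔH, as in this made-up example, would indicate enthalpically driven binding.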

Fluorescence spectroscopy is used for binding reactions with Kd in
the sub-µM range, where the determination of Kd by ITC is difficult. Trp
fluorescence (excitation at ~290 nm, emission at 300-350 nm) and Tyr
fluorescence are monitored as the
Fn3-mutant solution (≤ 10 µM) is titrated with ligand solution (≤ 100 µM). Kd
of the reaction is determined by nonlinear least-squares fitting of the
bimolecular binding equation. Presence of secondary binding sites is examined
using Scatchard analysis. In all binding assays, control experiments are
performed using wild-type Fn3 (or unrelated monobodies) in place of the
monobodies of interest.
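
A fit of a titration curve to the bimolecular binding equation can be sketched as follows. The sketch assumes 1:1 binding, a constant total protein concentration (dilution is ignored) and known fluorescence values for the free and bound protein. A coarse grid search stands in for a proper nonlinear least-squares routine, and all names and numbers are hypothetical.

```python
import math


def complex_concentration(protein_total, ligand_total, kd):
    """Concentration of the 1:1 complex from the quadratic solution of
    Kd = [P][L]/[PL] with mass conservation (all values in the same unit)."""
    total = protein_total + ligand_total + kd
    return (total - math.sqrt(total * total - 4.0 * protein_total * ligand_total)) / 2.0


def predicted_signal(protein_total, ligand_total, kd, f_free, f_bound):
    """Fluorescence as a weighted average of free and bound protein signals."""
    bound = complex_concentration(protein_total, ligand_total, kd) / protein_total
    return f_free + (f_bound - f_free) * bound


def fit_kd(ligand_points, signals, protein_total, f_free, f_bound):
    """Least-squares estimate of Kd over a logarithmic grid from 0.01 to 10."""
    candidates = [10 ** (k / 50.0) for k in range(-100, 51)]

    def sum_of_squares(kd):
        return sum(
            (predicted_signal(protein_total, lig, kd, f_free, f_bound) - obs) ** 2
            for lig, obs in zip(ligand_points, signals)
        )

    return min(candidates, key=sum_of_squares)


# Synthetic, noise-free titration in µM with a hypothetical Kd of 0.5 µM.
PROTEIN, F_FREE, F_BOUND, TRUE_KD = 1.0, 100.0, 40.0, 0.5
ligand_points = [0.0, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
signals = [predicted_signal(PROTEIN, lig, TRUE_KD, F_FREE, F_BOUND) for lig in ligand_points]
best_kd = fit_kd(ligand_points, signals, PROTEIN, F_FREE, F_BOUND)
print(round(best_kd, 3))
```

The quadratic form is needed because the protein concentration is comparable to Kd, so the free ligand concentration cannot be approximated by the total.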


Monobodies 
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Library screening was carried out in order to select monobodies that bind to 
specific ligands. This is complementary to the modeling approach described
above. The advantage of combinatorial screening is that one can easily produce
and screen a large number of variants (~10^8), which is not feasible with specific
mutagenesis ("rational design") approaches. The phage display technique
(Smith, 1985; O'Neil & Hoess, 1995) was used to effect the screening processes.
Fn3 was fused to a phage coat protein (pIII) and displayed on the surface of
filamentous phages. These phages harbor a single-stranded DNA genome that
contains the gene coding the Fn3 fusion protein. The amino acid sequence of
defined regions of Fn3 was randomized using a degenerate nucleotide sequence,
thereby constructing a library. Phages displaying Fn3 mutants with desired
binding capabilities were selected in vitro, recovered and amplified. The amino
acid sequence of a selected clone can be identified readily by sequencing the Fn3
gene of the selected phage. The protocols of Smith (Smith & Scott, 1993) were
followed with minor modifications.

The objective was to produce Monobodies which have high affinity to small
protein ligands. HEL and the B1 domain of staphylococcal protein G (hereafter
referred to as protein G) were used as ligands. Protein G is small (56 amino
acids) and highly stable (Minor & Kim, 1994; Smith et al., 1994). Its structure
was determined by NMR spectroscopy (Gronenborn et al., 1991) to be a helix
packed against a four-strand β-sheet. The resulting FnAb-protein G complex
(~150 residues) is one of the smallest protein-protein complexes produced to
date, well within the range of direct NMR methods. The small size, the high
stability and solubility of both components and the ability to label each with
stable isotopes (13C and 15N; see below for protein G) make the complexes an
ideal model system for NMR studies on protein-protein interactions.

The successful loop replacement of Fn3 (the mutant D1.3-4) demonstrates
that at least ten residues can be mutated without the loss of the global fold.
Based on this, a library was first constructed in which only residues in the FG
loop are randomized. After results of loop replacement experiments on the BC
loop were obtained, the mutation sites were extended to include the BC loop and
other sites.
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fwtr-rtion of FlU ph»1»- "M»V systeni 

An M13 phage-based expression vector pASM1 has been constructed as
follows: an oligonucleotide coding the signal peptide of OmpT was cloned at
the 5' end of the Fn3 gene; a gene fragment coding the C-terminal domain of
M13 pIII was prepared from the wild-type gene III of M13 mp18 using
PCR (Corey et al., 1993) and the fragment was inserted at the 3' end of the
OmpT-Fn3 gene; a spacer sequence has been inserted between Fn3 and pIII. The
fusion gene was placed in
M13 mp18, where it is under the control of the lac promoter. This
system will produce the Fn3-pIII fusion protein as well as the wild-type pIII
protein. The co-expression of wild-type pIII is expected to reduce the number of
fusion pIII proteins on a phage particle
(five copies of pIII are present on a phage particle). In addition, a smaller
number of fusion pIII proteins may be advantageous in selecting tight binding
proteins, because the chelating effect due to multiple binding sites should be
smaller than that with all five copies of fusion pIII (Bass et al., 1990).
Phages were produced and purified using E. coli K91kan (Smith & Scott, 1993)
according to a standard method (Sambrook et al., 1989), except that phage
particles were purified by a second polyethylene glycol precipitation and acid
precipitation.

Successful display of Fn3 on fusion phages has been confirmed by ELISA,
supporting the use of this system to construct libraries.

An alternative system using fUSE5 (Parmley & Smith, 1988) may also
be used. The Fn3 gene is inserted into fUSE5 using the SfiI restriction sites
introduced at the 5'- and 3'- ends of the Fn3 gene by PCR. This system displays
only the fusion pIII protein (up to five copies) on the surface of a phage. Phages
are produced and purified as described (Smith & Scott, 1993). This system has
been used to display many proteins and is robust. The advantage of fUSE5 is its
low toxicity. This is due to the low copy number of the replication form (RF) in
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the host, which in turn makes it difficult to prepare a sufficient amount of RF for 
library construction (Smith & Scott, 1993).

Construction of libraries

The first library was constructed of the Fn3 domain displayed on the surface
of M13 phage in which seven residues (77-83) in the FG loop (Fig. 4D) were
randomized. Randomization was achieved by the use of an oligonucleotide
containing a degenerate nucleotide sequence. A double-stranded nucleotide was
prepared by the same protocol as for gene synthesis (see above) except that one
strand had an (NNK)6(NNG) sequence at the mutation sites, where N
corresponds to an equimolar mixture of A, T, G and C and K corresponds to an
equimolar mixture of G and T. The (NNG) codon at residue 83 was required to
conserve the SacI restriction site (Fig. 2). The (NNK) codon codes all of the
20 amino acids, while the NNG codon codes 14. Therefore, this library
contained ~10^9 independent sequences. The library was constructed by ligating
the double-stranded nucleotide into the wild-type phage vector, pASM1, and then
transfecting E. coli XL1 blue (Stratagene) using electroporation. XL1 blue has
the lacIq phenotype and thus suppresses the expression of the Fn3-pIII fusion
protein in the absence of lac inducers. The initial library was propagated in this
way, to avoid selection against toxic Fn3-pIII clones. Phages displaying the
randomized Fn3-pIII fusion protein were prepared by propagating phages with
K91kan as the host. K91kan does not suppress the production of the fusion
protein, because it does not have lacIq. Another library was also generated in
which the BC loop (residues 26-30) was randomized.
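
The coding capacity of the degenerate codons and the resulting library size can be checked with a short enumeration. The sketch uses the standard genetic code and illustrative names. Under that code it finds 13 distinct amino acids plus the amber stop codon (TAG) among the 16 NNG codons, so the count of 14 quoted above presumably includes the amber codon; that reading is an assumption.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO_ACIDS[index]
    for index, codon in enumerate(product(BASES, repeat=3))
}

# Degenerate symbols used in the library oligonucleotides.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand(pattern):
    """All concrete codons matched by a degenerate three-letter pattern."""
    return ["".join(c) for c in product(*(DEGENERATE[s] for s in pattern))]


def encoded_residues(pattern):
    """Set of translation products ('*' marks a stop codon) for a pattern."""
    return {CODON_TABLE[codon] for codon in expand(pattern)}


nnk_amino = encoded_residues("NNK") - {"*"}
nng_amino = encoded_residues("NNG") - {"*"}

# (NNK)6(NNG): six fully degenerate positions followed by one NNG position.
dna_variants = len(expand("NNK")) ** 6 * len(expand("NNG"))
protein_variants = len(nnk_amino) ** 6 * len(nng_amino)

print(len(nnk_amino), len(nng_amino))
print(f"{dna_variants:.2e} DNA sequences, {protein_variants:.2e} protein sequences")
```

At the protein level the count is on the order of 10^9, in line with the library size stated above.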


Selection of displayed Monobodies

Screening of Fn3 phage libraries was performed using the biopanning
protocol (Smith & Scott, 1993); a ligand was biotinylated and the strong
biotin-streptavidin interaction was used to immobilize the ligand on a
streptavidin-coated dish. Experiments were performed at room temperature
(~22°C). For the initial recovery of phages from a library, 10 µg of a
biotinylated ligand was immobilized on a streptavidin-coated polystyrene dish
(35 mm, Falcon 1008) and then a phage solution (containing ~10^11 pfu
(plaque-forming units)) was added. After washing the dish with an appropriate
buffer (typically TBST: Tris-HCl (50 mM, pH 7.5), NaCl (150 mM) and Tween 20
(0.5%)), bound phages were eluted by one or a combination of the following
conditions: low pH, addition of a free ligand, urea (up to 6 M) and, in the case
of anti-protein G Monobodies, cleavage of the protein G-biotin linker by
thrombin. Recovered phages were amplified using the standard protocol with
K91kan as the host (Sambrook et al., 1989). The selection process was repeated
3-5 times to concentrate positive clones. From the second round on, the amount
of the ligand was gradually decreased (to ~1 µg) and the biotinylated ligand was
mixed with a phage solution before transferring to a dish (G. P. Smith, personal
communication). After the final round, 10-20 clones were picked, and their
DNA sequences were determined. The ligand affinity of the clones was
measured first by the phage-ELISA method (see below).

To suppress potential binding of the Fn3 framework (background binding)
to a ligand, wild-type Fn3 may be added as a competitor in the buffers. In
addition, unrelated proteins (e.g., bovine serum albumin, cytochrome c and
RNase A) may be used as competitors to select highly specific Monobodies.

Binding assay

The binding affinity of Monobodies on the phage surface is characterized
semi-quantitatively using the phage ELISA technique (Li et al., 1995). Wells of
microtiter plates (Nunc) are coated with a ligand protein (or with streptavidin
followed by the binding of a biotinylated ligand) and blocked with the Blotto
solution (Pierce). Purified phages (~10^10 pfu) originating from single plaques
(M13)/colonies (fUSE5) are added to each well and incubated overnight at 4°C.
After washing wells with an appropriate buffer (see above), bound phages are
detected by the standard ELISA protocol using anti-M13 Ab (rabbit, Sigma) and
anti-rabbit Ig-peroxidase conjugate (Pierce) or using anti-M13 Ab-peroxidase
conjugate (Pharmacia). Colorimetric assays are performed using TMB
(3,3',5,5'-tetramethylbenzidine, Pierce). The high affinity of protein G to
immunoglobulins presents a special problem; Abs cannot be used in detection.
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Therefore, to detect anti-protein G Monobodies, fusion phages are immobilized 
in wells and the binding is then measured using biotinylated protein G followed 
by the detection using streptavidin-peroxidase conjugate. 

Production of soluble Monobodies

After preliminary characterization of mutant Fn3s using phage ELISA,
mutant genes are subcloned into the expression vector pEW1. Mutant proteins
are produced as His-tag fusion proteins and purified, and their conformation,
stability and ligand affinity are characterized.

III. Increased Stability of Fn3 Scaffolds

"Higher stability" of a protein is defined as the ability of the protein to
retain the three-dimensional structure required for its function at a higher
temperature (in the case of thermal denaturation) or in the presence of a higher
concentration of a denaturing chemical reagent such as guanidine hydrochloride.
This type of "stability" is generally called "conformational stability." It has been
shown that conformational stability is correlated with resistance against
proteolytic degradation, i.e., breakdown of protein in the body (Kamtekar et al.,
1993).

Improving the conformational stability is a major goal in protein
engineering. Here, mutations have been developed by the inventor that enhance
the stability of the fibronectin type III domain (Fn3). The inventor has developed
a technology in which Fn3 is used as a scaffold to engineer artificial binding
proteins (Koide et al., 1998). It has been shown that many residues in the
surface loop regions of Fn3 can be mutated without disrupting the overall
structure of the Fn3 molecule, and that variants of Fn3 with a novel binding
function can be engineered using combinatorial library screening (Koide et al.,
1998). The inventor found that, although Fn3 is an excellent scaffold, Fn3
variants that contain a large number of mutations are destabilized against
chemical denaturation, compared to the wild-type Fn3 protein (Koide et al.,
1998). Thus, as more positions are mutated in order to engineer a new
binding function, the stability of such Fn3 variants further decreases, ultimately
leading to marginally stable proteins. Because artificial binding proteins must
maintain their three-dimensional structure to be functional, stability limits the
number of mutations that can be introduced in the scaffold. Thus, modifications
of the Fn3 scaffold that increase its stability are useful in that they allow one to
introduce more mutations for better function, and that they make it possible to
use Fn3-based engineered proteins in a wider range of applications.

The inventor found that wild-type Fn3 is more stable at acidic pH than at
neutral pH (Koide et al., 1998). The pH dependence of Fn3 stability is
characterized in Figure 18. The pH dependence curve has an apparent transition
midpoint near pH 4 (Figure 18). These results suggest that by identifying and
removing destabilizing interactions in Fn3 one is able to improve the stability of
Fn3 at neutral pH. It should be noted that most applications of engineered Fn3,
such as diagnostics, therapeutics and catalysts, are expected to be used near
neutral pH, and thus it is important to improve the stability at neutral pH.
Studies by other investigators have demonstrated that the optimization of surface
electrostatic properties can lead to a substantial increase in protein stability (Perl
et al. 2000, Spector et al. 1999, Loladze et al. 1999, Grimsley et al. 1999).

The pH dependence of Fn3 stability suggests that amino acids with pKa near
4 are involved in the observed transition. The carboxyl groups of aspartic acid
(Asp) and glutamic acid (Glu) have pKa in this range (Creighton, T.E. 1993). It
is well known that if a carboxyl group has unfavorable (i.e., destabilizing)
interactions in a protein, its pKa is shifted to a higher value from its standard,
unperturbed value (Yang and Honig 1992). Thus, the pKa values of all carboxyl
groups in Fn3 were determined using nuclear magnetic resonance (NMR)
spectroscopy, to identify carboxyl groups with unusual pKa's, as shown below.

First, the ¹³C resonances for the carboxyl carbon of each Asp and Glu residue
were assigned (Figure 19). Next, pH titration of ¹³C resonances was performed
for these groups (Figure 20). The pKa values for these residues are listed in
Table 3.

The standard deviations in the pKa values are less than 0.05 pH units.
*Data for D7 and D23 were fitted with a transition curve with two pKa values.
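
The titration analysis described above can be illustrated with a short sketch. It assumes the standard Henderson-Hasselbalch form for a chemical shift that moves between a protonated and a deprotonated limit under fast exchange. The two-pKa variant is modeled here as the sum of two independent transitions, which is an assumption, since the specification does not state the fitting function that was used. All function names and numerical values below are hypothetical and are not taken from Table 3.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of a titratable group that is deprotonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def shift_one_pka(ph, pka, delta_acid, delta_base):
    """Observed chemical shift for a single-site titration (fast exchange)."""
    f = fraction_deprotonated(ph, pka)
    return delta_acid + (delta_base - delta_acid) * f


def shift_two_pka(ph, pka1, pka2, delta_acid, step1, step2):
    """Assumed two-transition model: two independent shift changes added together."""
    return (delta_acid
            + step1 * fraction_deprotonated(ph, pka1)
            + step2 * fraction_deprotonated(ph, pka2))


if __name__ == "__main__":
    # Hypothetical shift limits (ppm) for a carboxyl 13C resonance; not measured values.
    for ph in (2.0, 3.0, 4.0, 5.0, 6.0, 7.0):
        print(ph, round(shift_one_pka(ph, 4.0, 176.0, 180.0), 3))
```

In such a model the pKa is the pH at which the observed shift lies halfway between the two limiting shifts, so an up-shifted pKa appears as a titration curve displaced toward higher pH.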

These results show that Asp 7 and 23, and Glu 9 have up-shifted pKa's with
respect to their unperturbed pKa's (approximately 4.0), indicating that these
residues are involved in unfavorable interactions. In contrast, the other Asp and
Glu residues have pKa's close to the respective unperturbed values, indicating
that the carboxyl groups of these residues do not significantly contribute to the
stability of Fn3.

In the three-dimensional structure of Fn3 (Main et al., 1992), Asp 7 and
23, and Glu 9 form a patch on the surface (Figure 21), with Asp 7 centrally
located in the patch. This spatial proximity of these negatively charged residues
explains why these residues have unfavorable interactions in Fn3. At low pH
where these residues are protonated and neutral, the unfavorable interactions are
expected to be mostly relieved. At the same time, the structure suggests that the
stability of Fn3 at neutral pH could be improved if the electrostatic repulsion
between these three residues is removed. Because Asp 7 is centrally located
among the three residues, it was decided to mutate Asp 7. Two mutants were
prepared, D7N and D7K (i.e., the aspartic acid at amino acid residue number 7
was substituted with an asparagine residue or a lysine residue, respectively). The
former replaces the negative charge with a neutral residue of virtually the same
size. The latter places a positive charge at residue 7.

The degrees of stability of the mutant proteins were characterized in
thermal and chemical denaturation measurements. In thermal denaturation
measurements, denaturation of the Fn3 proteins was monitored using circular
dichroism spectroscopy at the wavelength of 227 nm. All the proteins underwent
a cooperative transition (Figure 22). From the transition curves, the midpoints of
the transition (Tm) for the wild-type, D7N and D7K were determined to be 62, 69
and 70 °C in 0.02 M sodium phosphate buffer (pH 7.0) containing 0.1 M sodium
chloride and 6.2 M urea. Thus, the mutations increased the Tm of wild-type Fn3
by 7-8 °C.

Chemical denaturation of Fn3 proteins was monitored using fluorescence
emission from the single Trp residue of Fn3 (Figure 23). The free energies of
unfolding in the absence of guanidine HCl (ΔG°) were determined to be 7.4, 8.1
and 8.0 kcal/mol for the wild-type, D7N and D7K, respectively (a larger ΔG°
indicates a higher stability). The two mutants were again found to be more
stable than the wild-type protein.

These results show that a point mutation on the surface can significantly
enhance the stability of Fn3. Because these mutations are on the surface, they
minimally alter the structure of Fn3, and they can be easily introduced to other,
engineered Fn3 proteins. In addition, mutations at Glu 9 and/or Asp 23 also
enhance the stability of Fn3. Furthermore, mutations at one or more of these
three residues can be combined.

Thus, Fn3 is the fourth example of a monomeric immunoglobulin-like
scaffold that can be used for engineering binding proteins. Successful selection
of novel binding proteins has also been based on minibody, tendamistat and
"camelized" immunoglobulin VH domain scaffolds (Martin et al., 1994; Davies
& Riechmann, 1995; McConnell & Hoess, 1995). The Fn3 scaffold has
advantages over these systems. Bianchi et al. reported that the stability of a
[...] structural characterization of minibodies has been reported to date.
Tendamistat and the VH domain contain disulfide bonds, and thus preparation of
correctly folded proteins may be difficult. Davies and Riechmann reported that
the yields [...] Riechmann, 1996).

Thus, the Fn3 framework can be used as a scaffold for molecular
recognition. Its small size, stability and well-characterized structure make Fn3
an attractive system. In light of the ubiquitous presence of Fn3 in a wide variety
of natural proteins involved in ligand binding, one can engineer Fn3-based
binding proteins to different classes of targets.

IV. Reassociation of the Fibronectin Type III Domain by Fragment
Complementation

Specific binding molecules are useful for many purposes. One example
of specific binding molecules is antibodies generated by the immune system.
When an individual is exposed to a "foreign" target molecule, the individual's
immune system usually produces antibodies specific for the target molecule.
Antibodies, or other specific binding molecules, can be useful in laboratory and
commercial settings as well. At times, particular antibodies can be isolated from
animals that have been exposed to certain target molecules. It can also be useful
to generate artificially assembled libraries of specific binding molecules, which
are then screened for their abilities to bind to different target molecules.

Phage display selection (Rader and Barbas 1997; Hoess 2001) and yeast
two-hybrid assays (Fields and Song 1989; Geyer and Brent 2000) are among the
most widely used experiments for the selection of proteins from a library.
Protein selection mirrors the process in the immune response that selects
circulating antibodies having an affinity for a particular antigen. The
transformation efficiency of the host organism used to make the library,
however, limits the available size of the library. In order to expand the binding
capabilities and/or efficiencies, mutations can be introduced into a protein
sequence after an initial selection (Hawkins et al. 1992; Roberts et al. 1996;
Patten et al. 1996). This method mirrors somatic mutations in the immune
response in affinity maturation experiments.

One source of diversity in the immune response lies in the combination
of the light and heavy chains to form an antibody. Proper assembly of a light and
heavy chain pair is required for the protein to be functional. Successful assembly
of the heavy and light chains produced from a single vector has been
demonstrated (Barbas et al. 1991), and phage display methods have been
developed that make it possible to "mix and match" the heavy and light chains to
produce a diverse set of antibodies (Sblattero and Bradbury 2000; Sblattero et al.
2001). In contrast to immunoglobulins, most engineered binding proteins,
including monobodies, are based on monomeric proteins (Skerra 2000). This
monomeric nature of these engineered binding proteins makes it difficult to
explore heterodimerization reactions to increase the diversity of a library.
However, if an engineered binding protein could be manipulated so that its two
separate pieces self-assemble into a functional form, a more diverse library could
be achieved. By using such a two-part scheme, two fragments could be
separately diversified to generate their respective libraries, and then the two
[...] protein.

When a protein is cleaved in two fragments, individual fragments are
usually unfolded, and they often fail to reconstitute the original fold when mixed.
However, fragments of a number of proteins have been shown to reconstitute
into a native-like complex (de Prat Gay and Fersht; Kippen et al.;
Tasayco and Chao 1995; Ladurner et al. 1997; Pelletier et al. 1998; Tasayco et al.
2000; Berggard et al. 2001). In order to achieve fragment complementation [...]
have been placed in a flexible region of target proteins.

Protein interaction assays

Several screening strategies exploit fragment reconstitution of proteins.
For example, the split ubiquitin assay (Johnsson and Varshavsky 1994; Raquet et
al. 2001) allows in vivo detection of protein-protein interactions (Wittke et al.
1999). A bacterial fragment complementation assay (Pelletier et al. 1998;
Michnick et al. 2000) similarly uses fragment reconstitution of dihydrofolate
reductase to examine protein-protein interactions in E. coli. In these assays, two
proteins of interest are respectively fused to complementary fragments of a
reporter protein (ubiquitin or dihydrofolate reductase), and successful
reconstitution of the reporter protein indicates interaction between the two
proteins.

Protein reconstitution to efficiently generate combinatorial libraries 

Many molecular display technologies, where genetic information and
functional information are physically linked, such as phage display (Kay et al.
1996) and yeast display (Boder and Wittrup 1997), depend on transformation of
microbes. Such a transformation step tends to limit the number of independent
clones in a library that can be generated in a single transformation reaction to
~10⁹ in Escherichia coli and ~10⁷ in yeast. If one wishes to generate a biological
combinatorial library where two discrete segments of a protein are diversified,
one would typically need to generate a single DNA vector in which two
segments are diversified and transform bacteria (or yeast). In this case the library
size is still limited by the efficiency of the transformation reaction. One could use
in vivo recombination reactions to increase the library diversity as demonstrated
for antibody fragments (Sblattero and Bradbury 2000; Sblattero et al. 2001).
However, when applied to a monomeric protein, such approaches introduce
an artificial amino acid segment in the protein, whose effects on the stability and
structure are unpredictable.

The production of diverse combinatorial libraries would be greatly
simplified, and expanded, if one could reconstitute a binding protein from two
physically separate libraries, as in the combination of light and heavy chains
described above. A library of a protein reconstituted from two libraries of
complementary fragments has an effective size equal to the product of the sizes
of the two primary libraries. Thus, if one can efficiently combine (and
reconstitute) two (or more) fragment libraries, the resulting library would have
much greater diversity than the sum of the diversity of the fragment libraries. In
this scheme, the fragments must reconstitute with high affinity. Mutations
introduced into either fragment potentially decrease the affinity of
reconstitution. If a high affinity reconstitution library could be engineered using
a particular scaffold, intriguing new opportunities to dramatically increase the
library diversity would open up. Because fragments of a protein do not often
reconstitute with high affinity and specificity, experimental studies are needed
to explore this possibility for specific protein systems of interest.

As discussed above, the tenth fibronectin type III domain of human
fibronectin (FNfn10) is a small, monomeric β-sandwich protein, similar to
immunoglobulins. Small antibody mimics have been made using FNfn10 as a
scaffold. As discussed herein, mutations are introduced into various loop
regions of the fold. Fragments of FNfn10 that were produced by cleavage of a
peptide bond in the CD loop and EF loop were tested to determine whether
reconstitution occurs.

As described above, monobodies are engineered binding proteins using
the scaffold of the fibronectin type III domain (FN3). Surface loops connecting
beta-strands were modified to confer novel binding function. The present
inventor has further developed the monobody technology and showed that
monobodies that bind to a given target can be engineered by screening
combinatorial libraries in which amino acid residues in one or more surface
loops are diversified. Monobodies are compatible with virtually any molecular
display technique including, but not limited to, phage display, yeast surface
display, mRNA display and also yeast two-hybrid techniques.

As described above, the efficiency of introducing a nucleic acid library
(transformation) in a host usually limits the achievable size of a biological
library. For example, one can only construct a phage display library (host: E.
coli) of ~10⁹ independent clones and a yeast two-hybrid library (host: yeast) of
~10⁷ independent clones from a single transformation reaction. Theoretically,
libraries containing ~10⁹ and ~10⁷ clones include all possible sequences for only 6
and 4 randomized positions, respectively. Because the present inventor typically
diversified more than six positions, typical monobody libraries contained only a
small fraction of possible sequences. This may lead to a failure to identify a
monobody that binds to a target, or a failure to isolate the optimal monobody.
Thus, it is of considerable interest to increase the size of a biological library.
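
The relationship between library size and the number of randomized positions can be checked with a few lines of arithmetic. The sketch below is illustrative only: the text does not state whether sequences are counted at the amino-acid level (20 possibilities per position) or at the codon level (32 codons for an NNK mixture), so both are shown, and the function names and the 12-position case are hypothetical choices. At the codon level, 32⁶ is about 1.07 × 10⁹ and 32⁴ is about 1.05 × 10⁶.

```python
def sequence_space(positions, alphabet_size):
    """Number of distinct sequences for a given number of randomized positions."""
    return alphabet_size ** positions


def max_coverage(library_size, positions, alphabet_size):
    """Upper bound on the fraction of sequence space that a library can contain."""
    return min(1.0, library_size / sequence_space(positions, alphabet_size))


if __name__ == "__main__":
    for label, alphabet in (("amino acids", 20), ("NNK codons", 32)):
        for n in (4, 6, 12):
            space = sequence_space(n, alphabet)
            print(f"{label:11s} n={n:2d} space={space:.2e} "
                  f"coverage(1e9)={max_coverage(1e9, n, alphabet):.2e} "
                  f"coverage(1e7)={max_coverage(1e7, n, alphabet):.2e}")
```

The coverage value is an upper bound because it assumes that every clone in the library is unique and that all sequences are equally represented.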
In the present invention, a method has been developed to significantly
increase the size of a monobody library. This method exploits the reconstitution
of a monobody from two fragments, where each contains one or more functional
loops for target binding (see Fig. 43). A combinatorial library is made for each
fragment, and then a final, larger library is constructed by combining libraries for
the fragments. This is conceptually analogous to the formation of
immunoglobulins from two separate chains, the heavy and light chains. This
strategy is particularly suited for the yeast surface display and yeast two-hybrid
method, because yeast cells of opposite mating types can mate efficiently. For
example, if one has a library for the N-terminal half of the monobody with a size
of 10⁵ and a library for the C-terminal half with a size of 10⁵, combining these
two will theoretically yield a library of 10¹⁰. This is at least 10,000 fold greater
than the typical size of a single library constructed in yeast. One can apply the in
vivo recombination techniques (Sblattero and Bradbury 2000; Sblattero et al.
2001) to a plasmid vector containing two separate genes, where each gene
encodes a fragment of a monobody. This is possible because one can insert
arbitrary DNA sequences between the two genes for the fragments without
causing deleterious effects on the protein.
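
The gain from combining fragment libraries can be sketched as follows. This is a minimal illustration that assumes every member of one fragment library can pair with every member of the other; the function names are hypothetical, and the size of the single library used as a baseline is left as a parameter because the appropriate value depends on the display system.

```python
def combined_library_size(*fragment_library_sizes):
    """Effective size when each member of one fragment library can pair with each member of the others."""
    size = 1
    for n in fragment_library_sizes:
        size *= n
    return size


def fold_gain(combined_size, single_library_size):
    """Ratio of the combined library size to a single-library baseline."""
    return combined_size / single_library_size


if __name__ == "__main__":
    n_terminal, c_terminal = 10**5, 10**5
    combined = combined_library_size(n_terminal, c_terminal)
    print("clones constructed:", n_terminal + c_terminal)
    print("combinations after pairing:", combined)
```

The number of clones that must actually be constructed grows as the sum of the fragment library sizes, whereas the number of reconstituted proteins grows as their product.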

In order to achieve this reconstitution method, it first needed to be
demonstrated that two fragments of FN3 can actually reconstitute. Since
fragment reconstitution with high affinity does not always occur, experimental
verification was necessary. First, one needs to decide where to cut the FN3
scaffold. The CD loop was chosen as the initial cut site. The CD loop is at the
opposite end of the protein from the BC and FG loops that have been extensively
used for binding. Using a yeast two-hybrid system, the inventor confirmed that
the wild-type N-terminal fragment ("FNABC") interacted with the wild-type and
mutated C-terminal fragments ("FNDEFG" and its derivatives) (Figure 1). In
addition, data suggested that the dissociation constant (Kd) between FNABC and
FNDEFG was in the single nanomolar range, indicating very tight and specific
interaction (see EXAMPLE XXI). The present results demonstrated that when
cut in the CD loop, the two fragments of FN3 can reconstitute with high affinity.
Thus monobody libraries can be constructed using the fragment reconstitution
strategy.

In certain situations, particular mutations in the monobody (e.g. in the
BC, DE or FG loops) may have detrimental effects on reconstitution. In such a
situation, it is possible to attach a heterodimerization motif (such as coiled coils;
see e.g. McClain et al., 2001) at the C-terminus of FNABC and at the N-terminus
of FNDEFG to augment the reconstitution affinity of the two fragments.
Alternatively, an N-intein can be attached to the end of one of the fragment pair,
and a C-intein attached to the end of the other half of the fragment pair to
reconstitute the binding protein into one contiguous polypeptide (see
e.g. Yamazaki et al. 1998). Furthermore, a cysteine residue can be introduced in
each fragment in such a way that a disulfide bond is formed between the two
complementary fragments.

There are many other different classes of binding pairs that could
potentially be used to augment the reconstitution affinity of monobody
fragments. Examples include the following:

1. natural proteinases that are known to associate
   coiled coils (Oakley & Kim; see
   [...]PubMed&list_uids=7789535&dopt=Abstract)
2. a peptide-binding protein and its target peptides
   ([...]PubMed&list_uids=7510218&dopt=Abstract)
   ([...]PubMed&list_uids=9383403&dopt=Abstract)
3. fragments of a protein that have been artificially generated (similar to the
   [...] discussed extensively in the present specification)
   chymotrypsin inhibitor 2 (Ladurner et al. 1997)
   barnase (Sancho, J. & Fersht, A.R.; see
   http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=1569553&dopt=Abstract)
   ribonuclease S (S-protein/S-peptide) (Dwyer et al. 2001)
   green fluorescence protein (Merkel and Regan 2000)
4. inteins (Yamazaki et al. 1998)

Fragments of FN3 as a heterodimerization unit 

Complementary fragments of FN3 ("split FN3") can also be exploited as
heterodimerization motifs that bring two proteins of interest in close proximity.
Two proteins of interest, X and Y, are each fused to a fragment of FN3. Upon
association of the FN3 fragments, X and Y are held in close proximity. For this
purpose, FN3 fragments are derived from the wild-type sequence of FN3, as
demonstrated in Example XXII, or from variants of FN3 with increased stability.
In addition, mutations are introduced such that the new mutant fragments
associate, but they do not associate with fragments derived from the wild-type
sequence (see Example XXIII). Multiple sets of such unique binding pairs are
designed using this strategy. Such pairs can be generated by first introducing a
highly destabilizing mutation in one fragment and then screening a library of the
other fragment in which appropriate positions are diversified. One can use this
system to examine effects of bringing two proteins together in cell biology
(Fujiwara et al. 2002). One can also use this system to assemble nanostructures,
such as on a silicon surface. In the nanotechnology field, there are not many
tools to attach pieces with high selectivity. Having many different building
blocks is clearly useful when assembling complex structures that require
different attachment tools.

The following examples are intended to illustrate but not limit the 
invention. 


EXAMPLE I 
Construction of the Fn3 gene 
A synthetic gene for the tenth Fn3 of fibronectin (Fig. 1) was designed on the
basis of amino acid residues 1416-1509 of human fibronectin (Kornblihtt, et al.,
1985) and its three dimensional structure (Main, et al., 1992). The gene was
engineered to include convenient restriction sites for mutagenesis, and the
so-called "preferred codons" for high level protein expression (Gribskov, et al.,
1984) were used. In addition, a glutamine residue was inserted after the
N-terminal methionine in order to avoid partial processing of the N-terminal
methionine which often degrades NMR spectra (Smith, et al., 1994). Chemical
reagents were of the analytical grade or better and purchased from Sigma
Chemical Company and J.T. Baker, unless otherwise noted. Recombinant DNA
procedures were performed as described in "Molecular Cloning" (Sambrook, et
al., 1989), unless otherwise stated. Custom oligonucleotides were purchased
from Operon Technologies. Restriction and modification enzymes were from
New England Biolabs.

The gene was assembled in the following manner. First, the gene
sequence (Fig. 5) was divided into five parts with boundaries at designed
restriction sites: fragment 1, NdeI-PstI (oligonucleotides FN1F and FN1R (Table
2)); fragment 2, PstI-EcoRI (FN2F and FN2R); fragment 3, EcoRI-SalI (FN3F
and FN3R); fragment 4, SalI-SacI (FN4F and FN4R); fragment 5, SacI-BamHI
(FN5F and FN5R). Second, for each part, a pair of oligonucleotides which code
opposite strands and have complementary overlaps of approximately 15 bases
was synthesized. These oligonucleotides were designated FN1F-FN5R and are
shown in Table 2. Third, each pair (e.g., FN1F and FN1R) was annealed and
single-strand regions were filled in using the Klenow fragment of DNA
polymerase. Fourth, the double stranded oligonucleotide was digested with the
relevant restriction enzymes at the termini of the fragment and cloned into the
pBlueScript SK plasmid (Stratagene) which had been digested with the same
enzymes as those used for the fragments. The DNA sequence of the inserted
fragment was confirmed by DNA sequencing using an Applied Biosystems DNA
sequencer and the dideoxy termination protocol provided by the manufacturer.
Last, steps 2-4 were repeated to obtain the entire gene.
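
The anneal-and-fill-in step (the third step above) can be illustrated in silico. The sketch below assumes two oligonucleotides written 5' to 3' that encode opposite strands and are complementary at their 3' ends; it returns the top strand of the duplex expected after extension. The sequences, the function names and the `min_overlap` parameter are hypothetical and are not taken from Table 2.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def fill_in(forward, reverse, min_overlap=10):
    """Top strand of the duplex formed by annealing two oligos at their 3' ends and extending both."""
    bottom_as_top = reverse_complement(reverse)
    longest = min(len(forward), len(bottom_as_top))
    # Look for the longest stretch shared by the 3' end of the forward oligo
    # and the 5' end of the reverse oligo's complement.
    for k in range(longest, min_overlap - 1, -1):
        if forward[-k:] == bottom_as_top[:k]:
            return forward + bottom_as_top[k:]
    raise ValueError("no complementary 3' overlap of the required length")


if __name__ == "__main__":
    # Hypothetical 45-base target; the two oligos share a 15-base overlap.
    target = "ATGCAGGTTTCTGATGTTCCGCGTGACCTGGAAGTTGTTGCTGCG"
    forward_oligo = target[:30]
    reverse_oligo = reverse_complement(target[15:])
    print(fill_in(forward_oligo, reverse_oligo) == target)
```

Keeping each oligonucleotide short while relying on the overlap for priming is what allows a gene of this length to be built from ten oligonucleotides.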

The gene was also cloned into the pET3a and pET15b (Novagen) vectors
(pAS45 and pAS25, respectively). The maps of the plasmids are shown in Figs.
6 and 7. E. coli BL21 (DE3) (Novagen) containing these vectors expressed the
Fn3 gene under the control of bacteriophage T7 promoter (Studier, et al., 1990);
pAS24 expresses the 96-residue Fn3 protein only, while pAS45 expresses Fn3 as
a fusion protein with poly-histidine peptide (His-tag). High level expression of
the Fn3 protein and its derivatives in E. coli was detected as an intense band on
SDS-PAGE stained with CBB.

The binding reaction of the monobodies is characterized quantitatively by
means of fluorescence spectroscopy using purified soluble monobodies.

Intrinsic fluorescence is monitored to measure binding reactions. Trp
fluorescence (excitation at ~290 nm, emission at 300-350 nm) and Tyr
fluorescence (excitation at ~260 nm, emission at ~303 nm) are monitored as the
Fn3-mutant solution (≤100 µM) is titrated with a ligand solution. When a
ligand is fluorescent (e.g. fluorescein), fluorescence from the ligand may be
used. Kd of the reaction is determined by the nonlinear least-squares fitting
of the bimolecular binding equation.
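
A minimal sketch of such a fit is given below. It assumes a 1:1 binding model in which the fraction of bound protein is the physical root of the quadratic mass-action equation, and it assumes that the fluorescence signal has already been normalized to a bound fraction between 0 and 1. A golden-section search over log10(Kd) stands in for a general nonlinear least-squares routine; the function names, concentrations and search limits are hypothetical.

```python
import math


def fraction_bound(p_total, l_total, kd):
    """Fraction of protein bound for P + L <-> PL, all values in the same concentration units."""
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / (2.0 * p_total)


def sum_of_squares(kd, p_total, ligand_concs, observed):
    """Sum of squared residuals between the model and the observed bound fractions."""
    return sum((fraction_bound(p_total, l, kd) - y) ** 2
               for l, y in zip(ligand_concs, observed))


def fit_kd(p_total, ligand_concs, observed, lo=1e-12, hi=1e-2, iterations=200):
    """Least-squares estimate of Kd by golden-section search on log10(Kd)."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(lo), math.log10(hi)
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if (sum_of_squares(10.0 ** c, p_total, ligand_concs, observed)
                < sum_of_squares(10.0 ** d, p_total, ligand_concs, observed)):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


if __name__ == "__main__":
    # Simulated, noise-free titration with hypothetical values (molar units).
    protein = 1e-6
    ligands = [1e-7, 3e-7, 1e-6, 3e-6, 1e-5, 3e-5, 1e-4]
    data = [fraction_bound(protein, l, 2e-6) for l in ligands]
    print(fit_kd(protein, ligands, data))
```

With real data, the baseline and maximal fluorescence values would normally be fitted together with Kd, which requires a multi-parameter least-squares routine rather than the one-dimensional search used here.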

If intrinsic fluorescence cannot be used to monitor the binding reaction,
monobodies are labeled with fluorescein-NHS (Pierce) and fluorescence
polarization is used to monitor the binding reaction (Burke et al., 1996).

EXAMPLE II 

Modifications to include restriction sites in the Fn3 gene

The restriction sites were incorporated in the synthetic Fn3 gene without
changing the amino acid sequence of Fn3. The positions of the restriction sites
were chosen so that the gene construction could be completed without
synthesizing long (>60 bases) oligonucleotides and so that two loop regions
could be mutated (including by randomization) by the cassette mutagenesis
method (i.e., swapping a fragment with another synthetic fragment containing
mutations). In addition, the restriction sites were chosen so that most sites were
unique in the vector for phage display. Unique restriction sites allow one to
recombine monobody clones which have been already selected in order to supply
a larger sequence space.

EXAMPLE III
Construction of M13 phage display libraries

A vector for phage display, pAS38 (for its map, see Fig. 8) was
constructed as follows. The XbaI-BamHI fragment of pET12a encoding the
signal peptide of OmpT was cloned at the 5' end of the Fn3 gene. The
C-terminal region (from the FN5F and FN5R oligonucleotides, see Table 2) of the
Fn3 gene was replaced with a new fragment consisting of the FN5F and FN5R
oligonucleotides (Table 2) which introduced a MluI site and a linker sequence
for making a fusion protein with the pIII protein of bacteriophage M13. A gene
fragment coding the C-terminal domain of M13 pIII was prepared from the
wild-type gene III of M13mp18 using PCR (Corey, et al., 1993) and the fragment
was inserted at the 3' end of the OmpT-Fn3 fusion gene using the MluI and
HindIII sites.

Phages were produced and purified using a helper phage, M13K07,
according to a standard method (Sambrook, et al., 1989) except that phage
particles were purified by a second polyethylene glycol precipitation. Successful
display of Fn3 on fusion phages was confirmed by ELISA (Harlow & Lane,
1988) using an antibody against fibronectin (Sigma) and a custom anti-FN3
antibody (Cocalico Biologicals, PA, USA).

EXAMPLE IV 
Libraries containing loop variegations in the AB loop 

A nucleic acid phage display library having variegation in the AB loop is
prepared by the following methods. Randomization is achieved by the use of
oligonucleotides containing degenerate nucleotide sequences. Residues to be
variegated are identified by examining the X-ray and NMR structures of Fn3
(Protein Data Bank accession numbers, 1FNA and 1TTF, respectively).
Oligonucleotides containing NNK (N and K here denote an equimolar mixture of
A, T, G, and C and an equimolar mixture of G and T, respectively) for the
variegated residues are synthesized (see oligonucleotides BC3, FG2, FG3, and
FG4 in Table 2 for example). The NNK mixture codes for all twenty amino
acids and one termination codon (TAG). TAG, however, is suppressed in the E.
coli XL-1 blue. Single-stranded DNAs of pAS38 (and its derivatives) are
prepared using a standard protocol (Sambrook, et al., 1989).
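
The statement that an NNK mixture encodes all twenty amino acids and a single termination codon can be verified by enumeration. The sketch below builds the standard genetic code from its conventional TCAG-ordered string and lists the 32 NNK codons; the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO_ACIDS[i]
               for i, (a, b, c) in enumerate(product(BASES, repeat=3))}


def nnk_codons():
    """All codons with N (A, C, G or T) at positions 1 and 2 and K (G or T) at position 3."""
    return [a + b + c for a, b, c in product("ACGT", "ACGT", "GT")]


def encoded_residues(codons):
    """Map each encoded residue ('*' for a stop) to the codons that encode it."""
    encoded = {}
    for codon in codons:
        encoded.setdefault(CODON_TABLE[codon], []).append(codon)
    return encoded


if __name__ == "__main__":
    table = encoded_residues(nnk_codons())
    print("codons:", len(nnk_codons()))
    print("amino acids:", len([aa for aa in table if aa != "*"]))
    print("stop codons:", table.get("*", []))
```

Restricting the third position to G or T removes the TAA and TGA stop codons while still leaving at least one codon for every amino acid.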

Site-directed mutagenesis is performed following published methods (see
for example, Kunkel, 1985) using a Muta-Gene kit (BioRad). The libraries are
constructed by electroporation of E. coli XL-1 Blue electroporation competent
cells (200 µl; Stratagene) with 1 µg of the plasmid DNA using a BTX electrocell
manipulator ECM 395 and a 1 mm gap cuvette. A portion of the transformed cells
is plated on an LB-agar plate containing ampicillin (100 µg/ml) to determine the
transformation efficiency. Typically, 3 × 10⁸ transformants are obtained with 1
µg of DNA, and thus a library contains 10⁸ to 10⁹ independent clones. Phagemid
particles were prepared as described above.

EXAMPLE V 
Loop variegations in the BC, CD, DE, EF or FG loop 

A nucleic acid phage display library having five variegated residues
(residues number 26-30) in the BC loop, and one having seven variegated
residues (residue numbers 78-84) in the FG loop, was prepared using the
methods described in Example IV above. Other nucleic acid phage display
libraries having variegation in the CD, DE or EF loop can be prepared by similar
methods.

EXAMPLE VI 
Loop variegations in the FG and BC loop 

A nucleic acid phage display library having seven variegated residues
(residues number 78-84) in the FG loop and five variegated residues (residue
number 26-30) in the BC loop was prepared. Variegations in the BC loop were
prepared by site-directed mutagenesis (Kunkel, et al.) using the BC3
oligonucleotide described in Table 1. Variegations in the FG loop were
introduced using site-directed mutagenesis using the BC loop library as the
starting material, thereby resulting in libraries containing variegations in both BC
and FG loops. The oligonucleotide FG2 has variegating residues 78-84 and
oligonucleotide FG4 has variegating residues 77-81 and a deletion of residues
82-84.

PCT/US03/18030
WO 03/104418

A nucleic acid phage display library having five variegated residues
(residues 78-84) in the FG loop and a three residue deletion (residues 82-84) in
the FG loop, and five variegated residues (residues 26-30) in the BC loop, was
prepared. The shorter FG loop was made in an attempt to reduce the flexibility of
the FG loop; the loop was shown to be highly flexible in Fn3 by the NMR
studies of Main, et al. (1992). A highly flexible loop may be disadvantageous to
forming a binding site with a high affinity (a large entropy loss is expected upon
the ligand binding, because the flexible loop should become more rigid). In
addition, other Fn3 domains (besides human) have shorter FG loops (for
sequence alignment, see Figure 12 in Dickinson, et al. (1994)).

Randomization was achieved by the use of oligonucleotides containing
degenerate nucleotide sequence (oligonucleotide BC3 for variegating the BC
loop and oligonucleotides FG2 and FG4 for variegating the FG loops).

Site-directed mutagenesis was performed following published methods
(see for example, Kunkel, 1985). The libraries were constructed by
electrotransforming E. coli XL-1 Blue (Stratagene). Typically a library contains
10⁸ to 10⁹ independent clones. Library 2 contains five variegated residues in the
BC loop and seven variegated residues in the FG loop. Library 4 contains five
variegated residues in each of the BC and FG loops, and the length of the FG
loop was shortened by three residues.






EXAMPLE VII
fd phage display libraries constructed with loop variegations

Phage display libraries are constructed using the fd phage as the genetic
vector. The Fn3 gene is inserted in fUSE5 (Parmley & Smith, 1988) using SfiI
restriction sites which are introduced at the 5' and 3' ends of the Fn3 gene using
PCR. The expression of this phage results in the display of the fusion pIII
protein on the surface of the fd phage. Variegations in the Fn3 loops are
introduced using site-directed mutagenesis as described hereinabove, or by
subcloning the Fn3 libraries constructed in M13 phage into the fUSE5 vector.
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EXAMPLE VIII 
Other phage display libraries 

T7 phage libraries (Novagen, Madison, WI) and bacterial pili expression 
systems (Invitrogen) are also useful to express the Fn3 gene. 


EXAMPLE IX 

Isolation of polypeptides which bind to macromolecular structures 

The selection of phage-displayed monobodies was performed following
the protocols of Barbas and coworkers (Rosenblum & Barbas, 1995). Briefly,
approximately 1 µg of a target molecule ("antigen") in sodium carbonate buffer
(100 mM, pH 8.5) was immobilized in the wells of a microtiter plate (Maxisorp,
Nunc) by incubating overnight at 4°C in an air tight container. After the removal
of this solution, the wells were blocked with a 3% solution of BSA (Sigma,
Fraction V) in TBS by incubating the plate at 37°C for 1 hour. A phagemid
library solution (50 µl) containing approximately 10¹² colony forming units (cfu)
of phagemid was adsorbed in each well at 37°C for 1 hour. The wells were then
washed with an appropriate buffer (typically TBST: 50 mM Tris-HCl (pH 7.5),
150 mM NaCl, and 0.5% Tween20) three times (once for the first round).
Bound phage were eluted by an acidic solution (typically, 0.1 M glycine-HCl, pH
2.2; 50 µl) and recovered phage were immediately neutralized with 3 µl of Tris
solution. Alternatively, bound phage were eluted by incubating the wells with 50
µl of TBS containing the antigen (1 - 10 µM). Recovered phage were amplified
using the standard protocol employing XL-1 Blue cells as the host (Sambrook,
et al.). The selection process was repeated 5-6 times to concentrate positive
clones. After the final round, individual clones were picked and their binding
affinities and DNA sequences were determined.

The binding affinities of monobodies on the phage surface were
characterized using the phage ELISA technique (Li, et al., 1995). Wells of
microtiter plates (Nunc) were coated with an antigen and blocked with BSA.
Purified phages (10⁸ - 10¹¹ cfu) originating from a single colony were added to
each well and incubated 2 hours at 37°C. After washing wells with an
appropriate buffer (see above), bound phage were detected by the standard
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ELISA protocol using anti-M13 antibody (rabbit, Sigma) and anti-rabbit Ig- 
peroxidase conjugate (Pierce). Colorimetric assays were performed using Turbo- 
TMB (3,3',5,5'-tetramethylbenzidine, Pierce) as a substrate. 

The binding affinities of monobodies on the phage surface were further
characterized using competition ELISA (..., 1996). In this experiment, phage
ELISA is performed in the same manner as described above, except that the
phage solution contains a ligand at varied concentrations. The phage solution
was incubated at 4°C for one hour prior to ... The affinities of phage displayed
monobodies are estimated by the decrease in ELISA signal as
the free ligand concentration is increased.
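
As a rough illustration of how an apparent affinity might be read off such a competition experiment, the sketch below estimates the free ligand concentration giving 50% inhibition (IC50) by log-linear interpolation between the two measurements that bracket the half-maximal signal. The function name, the normalization to a no-competitor signal, and the sample readings are assumptions made for illustration; they are not taken from the experiments described here.

```python
import math

def estimate_ic50(conc, signal, signal_max, signal_min=0.0):
    """Estimate IC50 by interpolating on a log10 concentration axis.

    conc       -- free ligand concentrations, ascending and > 0
    signal     -- ELISA signal measured at each concentration
    signal_max -- signal without competitor (hypothetical control)
    signal_min -- background signal (hypothetical control)
    Returns None if the data never cross 50% of the uninhibited signal.
    """
    frac = [(s - signal_min) / (signal_max - signal_min) for s in signal]
    points = list(zip(conc, frac))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            # Position of the 50% point between the two bracketing readings
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_ic50 = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical readings (concentrations in uM, signals normalized to 1.0)
example_conc = [0.1, 1.0, 10.0, 100.0]
example_signal = [0.98, 0.8333, 0.3333, 0.0476]
ic50 = estimate_ic50(example_conc, example_signal, signal_max=1.0)
print(f"estimated IC50 ~ {ic50:.1f} uM")
```

Interpolation is only a coarse estimate; with more data points a nonlinear fit of a binding model would normally be preferred.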

After preliminary characterization of monobodies displayed on the
surface of phage using phage ELISA, genes for positive clones were subcloned
into the expression vector pAS45. E. coli BL21(DE3) (Novagen) was
transformed with an expression vector (pAS45 and its derivatives). Cells were
grown in M9 minimal medium and M9 medium supplemented with
Bactotryptone (Difco) containing ampicillin (200 µg/ml). For isotopic labeling,
¹⁵N NH₄Cl and/or ¹³C glucose replaced unlabeled components. Stable isotopes
were purchased from Isotec and Cambridge Isotope Labs. 500 ml medium in a
2 liter baffle flask was inoculated with 10 ml of overnight culture and agitated at
approximately 140 rpm at 37°C. IPTG was added at a final concentration of 1
mM to induce protein expression when OD(600 nm) reached approximately 1.0.
The cells were harvested by centrifugation 3 hours after the addition of IPTG and
kept frozen at -70°C until used.

Fn3 and monobodies with His-tag were purified as follows. Cells were
suspended in 5 ml/(g cell) of 50 mM Tris (pH 7.6) containing 1 mM
phenylmethylsulfonyl fluoride. Hen egg lysozyme (HEL; Sigma, 3X crystallized)
was added to a final concentration of 0.5 mg/ml. After incubating the solution
for 30 min at 37°C, it was sonicated three times for 30 seconds on ice so as to
cause cell breakage. Cell debris was removed by centrifugation ...
2B centrifuge using an SS-34 rotor. Concentrated sodium chloride was added to
the solution to a final concentration of 0.5 M. The solution was then applied to a
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1 ml HisTrap™ chelating column (Pharmacia) preloaded with nickel chloride 
(0.1 M, 1 ml) and equilibrated in the Tris buffer (50 mM, pH 8.0) containing 0.5 
M sodium chloride. After washing the column with the buffer, the bound protein 
was eluted with a Tris buffer (50 mM, pH 8.0) containing 0.5 M imidazole. The 

His-tag portion was cleaved off, when required, by treating the fusion protein
with thrombin using the protocol supplied by Novagen (Madison, WI). Fn3 was
separated from the His-tag peptide and thrombin by a ResourceS® column
(Pharmacia) using a linear gradient of sodium chloride (0 - 0.5 M) in sodium
acetate buffer (20 mM, pH 5.0).

Small amounts of soluble monobodies were prepared as follows. XL-1
Blue cells containing pAS38 derivatives (plasmids coding Fn3-pIII fusion
proteins) were grown in LB media at 37°C with vigorous shaking until OD(600
nm) reached approximately 1.0; IPTG was added to the culture to a final
concentration of 1 mM, and the cells were further grown overnight at 37°C.
Cells were removed from the medium by centrifugation, and the supernatant was
applied to a microtiter well coated with a ligand. Although XL-1 Blue cells
containing pAS38 and its derivatives express Fn3-pIII fusion proteins, soluble
proteins are also produced due to the cleavage of the linker between the Fn3 and
pIII regions by proteolytic activities of E. coli (Rosenblum & Barbas, 1995).
Binding of a monobody to the ligand was examined by the standard ELISA
protocol using a custom antibody against Fn3 (purchased from Cocalico
Biologicals, Reamstown, PA). Soluble monobodies obtained from the
periplasmic fraction of E. coli cells using a standard osmotic shock method were
also used.

EXAMPLE X 
Ubiquitin binding monobody 

Ubiquitin is a small (76 residue) protein involved in the degradation
pathway in eukaryotes. It is a single domain globular protein. Yeast ubiquitin
was purchased from Sigma Chemical Company and was used without further
purification.
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Libraries 2 and 4, described in Example VI above, were used to select 

ubiquitin-binding monobodies. ... followed by blocking with BSA (3% in TBS).
Panning was performed as described above. In the first two rounds, 1 µg of
ubiquitin was immobilized per well, and bound phage were eluted with an acidic
solution. From the third ...

Binding of selected clones was tested first in the polyclonal mode, ...
isolating individual clones. Selected clones from all libraries showed
significant binding to ubiquitin. These results are shown in Fig. ... The ...
completely by less than 30 µM soluble ubiquitin in the competition ELISA
experiments (see Fig. 10). The sequences of the BC and FG loops of
ubiquitin-binding monobodies are shown in Table 4.
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Table 4. Sequences of ubiquitin-binding monobodies 



Name   BC loop                  FG loop                    Occurrence (if more than one)

211    CARRA (SEQ ID NO:31)     RWDPLAK (SEQ ID NO:32)     2
212    CWRRA (SEQ ID NO:33)     RWVGLAW (SEQ ID NO:34)
213    CKHRR (SEQ ID NO:35)     FADLWWR (SEQ ID NO:36)
214    CRRGR (SEQ ID NO:37)     RGFMWLS (SEQ ID NO:38)
215    CNWRR (SEQ ID NO:39)     RAYRYRW (SEQ ID NO:40)
411    SRLRR (SEQ ID NO:41)     PPWRV (SEQ ID NO:42)       9
422    ARWTL (SEQ ID NO:43)     RRWWW (SEQ ID NO:44)
424    GQRTF (SEQ ID NO:45)     RRWWA (SEQ ID NO:46)


The 411 clone, which was the most enriched clone, was characterized
using phage ELISA. The 411 clone showed selective binding and inhibition of
binding in the presence of about 10 µM ubiquitin in solution (Fig. 11).

EXAMPLE XI 
Methods for the immobilization of small molecules 

Target molecules were immobilized in wells of a microtiter plate
(Maxisorp, Nunc) as described hereinbelow, and the wells were blocked with
BSA. In addition to the use of a carrier protein as described below, one could
make a conjugate of a target molecule and biotin (Pierce) and immobilize the
biotinylated ligand to a microtiter plate well which has been coated with
streptavidin (Smith and Scott, 1993).

... well. Alternatively, methods of chemical ...

EXAMPLE XII
Fluorescein binding monobody

Fluorescein has been used as a target for the selection of antibodies from
combinatorial libraries (Barbas, et al. ...). NHS-fluorescein was obtained ...
BSA (Sigma). ...

The selection process was repeated 5-6 times ... clones. In these
experiments, the phage ... mixture (BSA, cytochrome ..., ... mg/ml each) at room
temperature for 30 minutes, prior to the addition to ligand ... were eluted in
TBS containing 10 µM soluble ... picked and their binding affinities (see below)
and DNA sequences were determined.
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Table 5. Clones from Library #2 



            BC                       FG

WT          AVTVR (SEQ ID NO:47)     RGDSPAS (SEQ ID NO:48)

pLB24.1     CNWRR (SEQ ID NO:49)     RAYRYRW (SEQ ID NO:50)
pLB24.2     CMWRA (SEQ ID NO:51)     RWGMLRR (SEQ ID NO:52)
pLB24.3     ARMRE (SEQ ID NO:53)     RWLRGRY (SEQ ID NO:54)
pLB24.4     CARRR (SEQ ID NO:55)     RRAGWGW (SEQ ID NO:56)
pLB24.5     CNWRR (SEQ ID NO:57)     RAYRYRW (SEQ ID NO:58)
pLB24.6     RWRER (SEQ ID NO:59)     RHPWTER (SEQ ID NO:60)
pLB24.7     CNWRR (SEQ ID NO:61)     RAYRYRW (SEQ ID NO:62)
pLB24.8     ERRVP (SEQ ID NO:63)     RLLLWQR (SEQ ID NO:64)
pLB24.9     GRGAG (SEQ ID NO:65)     FGSFERR (SEQ ID NO:66)
pLB24.11    CRWTR (SEQ ID NO:67)     RRWFDGA (SEQ ID NO:68)
pLB24.12    CNWRR (SEQ ID NO:69)     RAYRYRW (SEQ ID NO:70)

Clones from Library #4

WT          AVTVR (SEQ ID NO:71)     GRGDS (SEQ ID NO:72)

pLB25.1     GQRTF (SEQ ID NO:73)     RRWWA (SEQ ID NO:74)
pLB25.2     GQRTF (SEQ ID NO:75)     RRWWA (SEQ ID NO:76)
pLB25.3     GQRTF (SEQ ID NO:77)     RRWWA (SEQ ID NO:78)
pLB25.4     LRYRS (SEQ ID NO:79)     GWRWR (SEQ ID NO:80)
pLB25.5     GQRTF (SEQ ID NO:81)     RRWWA (SEQ ID NO:82)
pLB25.6     GQRTF (SEQ ID NO:83)     RRWWA (SEQ ID NO:84)
pLB25.7     LRYRS (SEQ ID NO:85)     GWRWR (SEQ ID NO:86)
pLB25.9     LRYRS (SEQ ID NO:87)     GWRWR (SEQ ID NO:88)
pLB25.11    GQRTF (SEQ ID NO:89)     RRWWA (SEQ ID NO:90)
pLB25.12    LRYRS (SEQ ID NO:91)     GWRWR (SEQ ID NO:92)
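
The degree to which individual sequences dominate the sequenced clones can be tallied directly from the loop sequences listed in Table 5. The following is a minimal sketch of such a tally; the dictionary layout and the function name are illustrative choices, while the clone names and loop sequences are those of the table.

```python
from collections import Counter

# (BC loop, FG loop) per sequenced clone, as listed in Table 5
library2 = {
    "pLB24.1": ("CNWRR", "RAYRYRW"), "pLB24.2": ("CMWRA", "RWGMLRR"),
    "pLB24.3": ("ARMRE", "RWLRGRY"), "pLB24.4": ("CARRR", "RRAGWGW"),
    "pLB24.5": ("CNWRR", "RAYRYRW"), "pLB24.6": ("RWRER", "RHPWTER"),
    "pLB24.7": ("CNWRR", "RAYRYRW"), "pLB24.8": ("ERRVP", "RLLLWQR"),
    "pLB24.9": ("GRGAG", "FGSFERR"), "pLB24.11": ("CRWTR", "RRWFDGA"),
    "pLB24.12": ("CNWRR", "RAYRYRW"),
}
library4 = {
    "pLB25.1": ("GQRTF", "RRWWA"), "pLB25.2": ("GQRTF", "RRWWA"),
    "pLB25.3": ("GQRTF", "RRWWA"), "pLB25.4": ("LRYRS", "GWRWR"),
    "pLB25.5": ("GQRTF", "RRWWA"), "pLB25.6": ("GQRTF", "RRWWA"),
    "pLB25.7": ("LRYRS", "GWRWR"), "pLB25.9": ("LRYRS", "GWRWR"),
    "pLB25.11": ("GQRTF", "RRWWA"), "pLB25.12": ("LRYRS", "GWRWR"),
}

def tally_clones(clones):
    """Return (loop-sequence pair, count) tuples, most frequent first."""
    return Counter(clones.values()).most_common()

for name, clones in (("Library #2", library2), ("Library #4", library4)):
    print(name)
    for (bc, fg), count in tally_clones(clones):
        print(f"  BC={bc:6s} FG={fg:8s} x{count}")
```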



Preliminary characterization of the binding affinities of selected clones
was performed using phage ELISA and competition phage ELISA (see Fig. 12
(Fluorescein-1) and Fig. 13 (Fluorescein-2)). The four clones tested showed
specific binding to the ligand-coated wells, and the binding reactions are
inhibited by soluble fluorescein (see Fig. 13).

EXAMPLE XIII 
Digoxigenin binding monobody 

Digoxigenin-3-O-methyl-carbonyl-ε-aminocaproic acid-NHS
(Boehringer Mannheim) is used to prepare a digoxigenin-BSA conjugate. The
... used for panning. Panning is repeated 5 to 6 times to enrich binding clones.
Because digoxigenin is sparingly soluble in aqueous solution, bound phages are
eluted from the well using acidic solution. See Example XIV.

EXAMPLE XIV 
TSAC (transition state analog compound) binding monobodies 

Carbonate hydrolyzing monobodies are selected as follows. A transition
... an Arbuzov reaction as described previously (Jacobs and Schultz, 1987). The
phosphonate is then coupled to the carrier protein, BSA, using carbodiimide,
followed by exhaustive dialysis (Jacobs and Schultz, 1987). The hapten-BSA
conjugate is immobilized in the wells of a microtiter plate and monobody
selection is performed as described above. Catalytic activities of selected
monobodies are tested using 4-nitrophenyl carbonate as the substrate.

Other haptens useful to produce catalytic monobodies are summarized in
H. Suzuki (1994) and in N.R. Thomas (1994).

EXAMPLE XV 
NMR characterization of Fn3 and comparison of the Fn3 
secreted by yeast with that secreted by E. coli 
Nuclear magnetic resonance (NMR) experiments are performed to
identify the contact surface between FnAb and a target molecule, e.g.,
monobodies to fluorescein, ubiquitin, RNaseA and soluble derivatives of
digoxigenin. The information is then used to improve the affinity and
specificity of the monobody. Purified monobody samples are dissolved in an
appropriate buffer for NMR spectroscopy using an Amicon ultrafiltration cell
with a YM-3 membrane. Buffers are made with 90% H₂O/10% D₂O (distilled grade,
Isotec) or with 100% D₂O. Deuterated compounds (e.g. acetate) are used to
eliminate strong signals from them.

NMR experiments are performed on a Varian Unity INOVA 600
spectrometer equipped with four RF channels and a triple resonance probe with
pulsed field gradient capability. NMR spectra are analyzed using processing
programs such as Felix (Molecular Simulations), nmrPipe, PIPP, and CAPP
(Garrett, et al., 1991; Delaglio, et al., 1995) on UNIX workstations. Sequence
specific resonance assignments are made using a well-established strategy based
on a set of triple resonance experiments (CBCA(CO)NH and HNCACB) (Grzesiek &
Bax, 1992; Wittenkind & Mueller, 1993).

Nuclear Overhauser effect (NOE) is observed between ¹H nuclei closer
than approximately 5 Å, which allows one to obtain information on interproton
distances. A series of double- and triple-resonance experiments (Table 6; for
recent reviews on these techniques, see Bax & Grzesiek, 1993 and Kay, 1995)
are performed to collect distance (i.e. NOE) and dihedral angle (J-coupling)
constraints. Isotope-filtered experiments are performed to determine resonance
assignments of the bound ligand and to obtain distance constraints within the
ligand and those between FnAb and the ligand. Sequence specific
resonance assignments and NOE peak assignments have been described in detail
elsewhere (Clore & Gronenborn, 1991; Pascal, et al., 1994b; Metzler, et al.,
1996).

Table 6. NMR experiments for structure characterization 

Experiment Name                          Reference

1. reference spectra
2D-¹H,¹⁵N-HSQC                           (Bodenhausen & Ruben, 1980; Kay, et al., 1992)
2D-¹H,¹³C-HSQC                           (Bodenhausen & Ruben, 1980; Vuister & Bax, 1992)

2. backbone and side chain resonance assignments of ¹³C/¹⁵N-labeled protein
3D-CBCA(CO)NH                            (Grzesiek & Bax, 1992)
3D-HNCACB                                (Wittenkind & Mueller, 1993)
3D-C(CO)NH                               (Logan et al., 1992; Grzesiek et al., 1993)
3D-H(CCO)NH
3D-HBHA(CBCACO)NH                        (Grzesiek & Bax, 1993)
3D-HCCH-TOCSY                            (Kay et al., 1993)
3D-HCCH-COSY                             (... et al., 1991)
3D-¹H,¹⁵N-TOCSY-HSQC                     (Zhang et al., 1994)
2D-HB(CBCDCE)HE                          (Yamazaki et al., 1993)

3. resonance assignments of unlabeled ligand
2D-isotope-filtered ¹H-TOCSY
2D-isotope-filtered ¹H-COSY
2D-isotope-filtered ¹H-NOESY             (Ikura & Bax, 1992)

4. structural constraints
within labeled protein
3D-¹H,¹⁵N-NOESY-HSQC                     (Zhang et al., 1994)
4D-¹H,¹³C-HMQC-NOESY-HMQC                (Vuister et al., 1993)
4D-¹H,¹³C,¹⁵N-HSQC-NOESY-HSQC            (Muhandiram et al., 1993; Pascal et al., 1994a)
within unlabeled ligand
2D-isotope-filtered ¹H-NOESY             (Ikura & Bax, 1992)
interactions between protein and ligand
3D-isotope-filtered ¹H,¹⁵N-NOESY-HSQC
3D-isotope-filtered ¹H,¹³C-NOESY-HSQC    (Lee et al., 1994)

5. dihedral angle constraints
J-modulated ¹H,¹⁵N-HSQC                  (Billeter et al., 1992)
3D-HNHB                                  (Archer et al., 1991)

Backbone ¹H, ¹⁵N and ¹³C resonance assignments for a monobody are
compared to those for wild-type Fn3 to assess structural changes in the mutant.
Once these data establish that the mutant retains the global structure, structural
refinement is performed using experimental NOE data. Because the structural
difference of a monobody is expected to be minor, the wild-type structure can be
used as the initial model after modifying the amino acid sequence. The
mutations are introduced to the wild-type structure by interactive molecular
modeling, and then the structure is energy-minimized using a molecular
modeling program such as Quanta (Molecular Simulations). The solution structure
is refined using cycles of dynamical simulated annealing (Nilges et al., 1988) in
the program X-PLOR (Brunger, 1992). Typically, an ensemble of fifty structures
is calculated. The validity of the refined structures is confirmed by calculating a
smaller number of structures from randomly generated initial structures in
X-PLOR using the YASAP protocol (Nilges, et al., 1991). The structure of a
monobody-ligand complex is calculated by first refining both components
individually using intramolecular NOEs, and then docking the two using
intermolecular NOEs.

For example, the ¹H,¹⁵N-HSQC spectrum for the fluorescein-binding
monobody LB25.5 is shown in Figure 14. The spectrum shows a good
dispersion (peaks are spread out), indicating that LB25.5 is folded into a globular
conformation. Further, the spectrum resembles that for the wild-type Fn3,
showing that the overall structure of LB25.5 is similar to that of Fn3. These
results demonstrate that ligand-binding monobodies can be obtained without
changing the global fold of the Fn3 scaffold.

Chemical shift perturbation experiments are performed by forming the
complex between an isotope-labeled FnAb and an unlabeled ligand. The
formation of a stoichiometric complex is followed by recording the HSQC
spectrum. Because chemical shift is extremely sensitive to nuclear environment,
formation of a complex usually results in substantial chemical shift changes for
resonances of amino acid residues in the interface. Isotope-edited NMR
experiments (2D HSQC and 3D CBCA(CO)NH) are used to identify the
resonances that are perturbed in the labeled component of the complex; i.e. the
monobody. Although the possibility of artifacts due to long-range
conformational changes must always be considered, substantial differences for
... (...; Gronenborn & Clore, 1993).

... measurements. HX rates for each amide proton are measured for ¹⁵N
labeled monobody ... to result in decreased amide HX rates ... lowering pH and
the HSQC spectrum is recorded at low pH where amide HX is slow. ...
the prerequisite for the experiments.

EXAMPLE XVI
... Display System Specific for Ubiquitin

... characterized. ... display of Fn3 was performed as in ... a phagemid ...
helper phage (Bass et al., 1990). ... of Fn3 displayed on the surface. The
surface display of Fn3 ... detected by ELISA using an anti-Fn3 antibody. Only
phages containing the Fn3-...

... of Fn3 was constructed as in Example III. Random sequences were
introduced ... of the FG loop. Five residues (26-30) were also randomized in the
BC loop in order to provide a larger contact surface with the target molecule.
Thus, the resulting library contains five randomized residues in each of the BC
and FG loops (Table 7). This library contained approximately 10⁸ independent
clones.
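
To put a library of about 10⁸ clones in perspective, the sketch below compares it with the theoretical sequence space of ten randomized positions. It assumes every randomized position uses the NNK codon scheme given in the footnote to Table 7 (32 codons, which encode all 20 amino acids plus the single stop codon TAG); the function and key names are illustrative only.

```python
def nnk_library_stats(n_positions, n_clones):
    """Compare a library of n_clones with the NNK sequence space.

    Assumes each randomized position is an NNK codon:
    N = A/C/G/T, K = G/T, giving 4 * 4 * 2 = 32 codons, of which
    31 encode an amino acid and one (TAG) is a stop codon.
    """
    codons_per_position = 4 * 4 * 2
    dna_variants = codons_per_position ** n_positions
    protein_variants = 20 ** n_positions
    # Fraction of clones expected to carry no stop codon in the loops
    stop_free_fraction = (31 / 32) ** n_positions
    # Upper bound on the share of protein sequence space that can be sampled
    max_coverage = min(1.0, n_clones / protein_variants)
    return {
        "dna_variants": dna_variants,
        "protein_variants": protein_variants,
        "stop_free_fraction": stop_free_fraction,
        "max_coverage": max_coverage,
    }

# Five randomized residues in each of the BC and FG loops, ~1e8 clones
stats = nnk_library_stats(n_positions=10, n_clones=10**8)
for key, value in stats.items():
    print(f"{key}: {value:.3g}")
```

Under these assumptions a library of this size can sample only a small fraction of the possible loop sequences, so selected clones should be read as good binders among those sampled rather than as the best possible sequences.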

Library Screening

Library screening was performed using ubiquitin as the target molecule.
In each round of panning, Fn3-phages were adsorbed to a ubiquitin-coated
surface, and bound phages were eluted competitively with soluble ubiquitin. The
recovery ratio improved from 4.3 × 10⁻⁷ in the second round to 4.5 × 10⁻⁶ in the
fifth round, suggesting an enrichment of binding clones. After five rounds of
panning, the amino acid sequences of individual clones were determined (Table
7).
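
The enrichment implied by these recovery ratios can be made explicit with a short calculation. In the sketch below the recovery ratio is taken to mean eluted phage divided by input phage, which is an assumption about the definition used; the cfu counts in the first call are hypothetical, while the two ratios in the second call are the values quoted above.

```python
def recovery_ratio(output_cfu, input_cfu):
    """Fraction of input phage recovered after one round of panning."""
    return output_cfu / input_cfu

def fold_enrichment(later_ratio, earlier_ratio):
    """How many times the recovery ratio increased between two rounds."""
    return later_ratio / earlier_ratio

# Hypothetical counts that would give the second-round ratio
print(recovery_ratio(output_cfu=4.3e5, input_cfu=1e12))

# Ratios reported for the second and fifth rounds
fold = fold_enrichment(4.5e-6, 4.3e-7)
print(f"recovery improved about {fold:.1f}-fold")
```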



Table 7. Sequences in the variegated loops of enriched clones 



Name             BC loop                            FG loop                                     Frequency

Wild Type        GCAGTTACCGTGCGT (SEQ ID NO:93)     GGCCGTGGTGACAGCCCAGCGAGC (SEQ ID NO:95)
                 AlaValThrValArg (SEQ ID NO:94)     GlyArgGlyAspSerProAlaSer (SEQ ID NO:96)

Library (a)      NNKNNKNNKNNKNNK                    NNKNNKNNKNNKNNK
                  X  X  X  X  X                      X  X  X  X  X  (deletion)

clone1 (Ubi4)    TCGAGGTTGCGGCGG (SEQ ID NO:97)     CCGCCGTGGAGGGTG (SEQ ID NO:99)              9
                 SerArgLeuArgArg (SEQ ID NO:98)     ProProTrpArgVal (SEQ ID NO:100)

clone2           GGTCAGCGAACTTTT (SEQ ID NO:101)    AGGCGGTGGTGGGCT (SEQ ID NO:103)             1
                 GlyGlnArgThrPhe (SEQ ID NO:102)    ArgArgTrpTrpAla (SEQ ID NO:104)

clone3           GCGAGGTGGACGCTT (SEQ ID NO:105)    AGGCGGTGGTGGTGG (SEQ ID NO:107)             1
                 AlaArgTrpThrLeu (SEQ ID NO:106)    ArgArgTrpTrpTrp (SEQ ID NO:108)

(a) N denotes an equimolar mixture of A, T, G and C; K denotes an equimolar mixture of G and T.
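
The correspondence between the nucleotide and amino acid sequences in Table 7 can be checked by translating the loop DNA with the standard genetic code. The sketch below builds the codon table from the conventional TCAG ordering; the function name is illustrative, and the sequences passed to it are those printed in the table.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA string into one-letter amino acids ('*' = stop)."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

table7_loops = {
    "wild type BC": "GCAGTTACCGTGCGT",
    "wild type FG": "GGCCGTGGTGACAGCCCAGCGAGC",
    "clone1 (Ubi4) BC": "TCGAGGTTGCGGCGG",
    "clone1 (Ubi4) FG": "CCGCCGTGGAGGGTG",
}
for label, dna in table7_loops.items():
    print(f"{label}: {dna} -> {translate(dna)}")
```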
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A clone, dubbed Ubi4, dominated the enriched pool of Fn3 variants. Therefore, 
further investigation was focused on this Ubi4 clone. Ubi4 contains four 
mutations in the BC loop (Arg 30 in the BC loop was conserved) and five 
mutations and three deletions in the FG loop. Thus 13% (12 out of 94) of the
residues were altered in Ubi4 from the wild-type sequence. 

Figure 15 shows a phage ELISA analysis of Ubi4. The Ubi4 phage binds
to the target molecule, ubiquitin, with a significant affinity, while a phage
displaying the wild-type Fn3 domain or a phage with no displayed molecules
shows little detectable binding to ubiquitin (Figure 15a). In addition, the Ubi4
phage showed a somewhat elevated level of background binding to the control
surface lacking the ubiquitin coating. A competition ELISA experiment shows
that the IC₅₀ (concentration of the free ligand which causes 50% inhibition of
binding) of the binding reaction is approximately 5 µM (Fig. 15b). BSA, bovine
ribonuclease A and cytochrome C show little inhibition of the Ubi4-ubiquitin
binding reaction (Figure 15c), indicating that the binding reaction of Ubi4 to
ubiquitin does result from specific binding.



Characterization of a Mutant Fn3 Protein

The expression system yielded 50-100 mg Fn3 protein per liter culture.
A similar level of protein expression was observed for the Ubi4 clone and other
mutant Fn3 proteins.

Ubi4-Fn3 was expressed as an independent protein. Though a majority
of Ubi4 was expressed in E. coli as a soluble protein, its solubility was found to
be significantly reduced as compared to that of wild-type Fn3. Ubi4 was soluble
up to ~20 µM at low pH, with much lower solubility at neutral pH. This
solubility was not high enough for detailed structural characterization using
NMR spectroscopy or X-ray crystallography.

The solubility of the Ubi4 protein was improved by adding a solubility
tail, GKKGK (SEQ ID NO:109), as a C-terminal extension. The gene for
Ubi4-Fn3 was subcloned into the expression vector pAS45 using PCR. The
C-terminal solubilization tag, GKKGK (SEQ ID NO:109), was incorporated in this
step. E. coli BL21(DE3) (Novagen) was transformed with the expression vector
(pAS45 and its derivatives). Cells were grown in M9 minimal media and M9
media supplemented with Bactotryptone (Difco) containing ampicillin (200
µg/ml). For isotopic labeling, ¹⁵N NH₄Cl replaced unlabeled NH₄Cl in the
media. 500 ml medium in a 2 liter baffle flask was inoculated with 10 ml of
overnight culture and agitated at 37°C. IPTG was added at a final concentration
of 1 mM to initiate protein expression when OD (600 nm) reached 1.0. The
cells were harvested by centrifugation 3 hours after the addition of IPTG and
kept frozen at -70°C until used.

Proteins were purified as follows. Cells were suspended in 5 ml/(g cell)
of Tris (50 mM, pH 7.6) containing phenylmethylsulfonyl fluoride (1 mM). Hen
egg lysozyme (Sigma) was added to a final concentration of 0.5 mg/ml. After
incubating the solution for 30 minutes at 37°C, it was sonicated three times for
30 seconds on ice. Cell debris was removed by centrifugation. Concentrated
sodium chloride was added to the solution to a final concentration of 0.5 M. The
solution was applied to a Hi-Trap chelating column (Pharmacia) preloaded with
nickel and equilibrated in the Tris buffer containing sodium chloride (0.5 M).
After washing the column with the buffer, histag-Fn3 was eluted with the buffer
containing 500 mM imidazole. The protein was further purified using a
ResourceS column (Pharmacia) with a NaCl gradient in a sodium acetate buffer
(20 mM, pH 4.6).

With the GKKGK (SEQ ID NO:109) tail, the solubility of the Ubi4
protein was increased to over 1 mM at low pH and up to ~50 µM at neutral pH.
Therefore, further analyses were performed on Ubi4 with this C-terminal
extension (hereafter referred to as Ubi4-K). It has been reported that the
solubility of a minibody could be significantly improved by addition of three Lys
residues at the N- or C-termini (Bianchi et al., 1994). In the case of protein Rop, a
non-structured C-terminal tail is critical in maintaining its solubility (Smith et
al., 1995).

Oligomerization states of the Ubi4 protein were determined using a size
exclusion column. The wild-type Fn3 protein was monomeric at low and neutral
pH. However, the peak of the Ubi4-K protein was significantly broader than
that of wild-type Fn3, and eluted after the wild-type protein. This suggests
interactions between Ubi4-K and the column material, precluding the use of size
exclusion chromatography to determine the oligomerization state of Ubi4. NMR
studies suggest that the protein is monomeric at low pH.

The Ubi4-K protein retained a binding affinity to ubiquitin as judged by
ELISA (Figure 15d). However, an attempt to determine the dissociation constant
using a biosensor (Affinity Sensors, Cambridge, U.K.) failed because of high
background binding of Ubi4-K-Fn3 to the sensor matrix. This matrix mainly
consists of dextran, consistent with the observation that Ubi4-K interacts with
the cross-linked dextran of the size exclusion column.

Example XVII
Stability Measurements of Monobodies

Guanidine hydrochloride (GuHCl)-induced unfolding and refolding
reactions were followed by measuring fluorescence. Experiments
were performed on a Spectronic AB-2 spectrofluorometer equipped with a
motor-driven syringe (Hamilton Co.). The cuvette temperature was kept at 30°C.
The spectrofluorometer and the syringe were controlled by a single computer
using a home-built interface. This system automatically records a series of
spectra following GuHCl titration. An experiment started with a 1.5 ml buffer
solution containing 5 µM protein. An emission spectrum (300-400 nm;
excitation at 290 nm) was recorded following a delay (3-5 minutes) after each
injection (50 or 100 µl) of a buffer solution containing GuHCl. These steps were
repeated until the solution volume reached the full capacity of a cuvette (3.0 ml).
Fluorescence intensities were normalized as ratios to the intensity at an
isofluorescent point which was determined in separate experiments. Unfolding
curves were fitted with a two-state model using a nonlinear least-squares routine
(Santoro & Bolen, 1988). No significant differences were observed between
experiments with delay times (between an injection and the start of spectrum
acquisition) of 2 minutes and 10 minutes, indicating that the unfolding/refolding
reactions reached close to an equilibrium at each concentration point within the
delay times used.
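
Because each injection both raises the GuHCl concentration and dilutes the protein, it can help to tabulate the composition of the cuvette after every step. The sketch below does this for the volumes given above (1.5 ml starting volume, 50 µl injections, 3.0 ml capacity, 5 µM protein). The stock GuHCl concentration of 8 M is a hypothetical value chosen for illustration, and the titrant is assumed to contain no protein.

```python
def titration_schedule(start_ml, injection_ml, capacity_ml, stock_molar, protein_um):
    """List (injection number, volume, [GuHCl], [protein]) after each injection.

    stock_molar is the GuHCl concentration of the injected buffer
    (hypothetical here); the titrant is assumed to be protein-free.
    """
    steps = []
    n = 0
    # Small tolerance guards against floating-point round-off at capacity
    while start_ml + (n + 1) * injection_ml <= capacity_ml + 1e-9:
        n += 1
        added_ml = n * injection_ml
        total_ml = start_ml + added_ml
        guhcl_molar = stock_molar * added_ml / total_ml
        protein_now_um = protein_um * start_ml / total_ml
        steps.append((n, total_ml, guhcl_molar, protein_now_um))
    return steps

schedule = titration_schedule(1.5, 0.05, 3.0, stock_molar=8.0, protein_um=5.0)
for n, volume, guhcl, protein in schedule[::5] + [schedule[-1]]:
    print(f"injection {n:2d}: {volume:.2f} ml, {guhcl:.2f} M GuHCl, {protein:.2f} uM protein")
```

With these volumes the final denaturant concentration is half that of the stock and the protein is diluted two-fold, which is one reason normalizing intensities to an isofluorescent point, as described above, is useful.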
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Conformational stability of Ubi4-K was measured using above-described 
GuHCl-induced unfolding method. The measurements were performed under 
two sets of conditions; first at pH 3.3 in the presence of 300 mM sodium 
chloride, where Ubi4-K is highly soluble, and second in TBS, which was used 

for library screening. Under both conditions, the unfolding reaction was
reversible, and we detected no signs of aggregation or irreversible unfolding.
Figure 16 shows unfolding transitions of Ubi4-K and wild-type Fn3 with the
N-terminal (His)₆ tag and the C-terminal solubility tag. The stability of wild-type
Fn3 was not significantly affected by the addition of these tags. Parameters
characterizing the unfolding transitions are listed in Table 8.



Table 8. Stability parameters for Ubi4 and wild-type Fn3 as determined by 
GuHCl-induced unfolding 



Protein               ΔG⁰ (kcal mol⁻¹)    m_G (kcal mol⁻¹ M⁻¹)

Ubi4 (pH 7.5)         4.8 ± 0.1           2.12 ± 0.04
Ubi4 (pH 3.3)         6.5 ± 0.1           2.07 ± 0.02
Wild-type (pH 7.5)    7.2 ± 0.2           1.60 ± 0.04
Wild-type (pH 3.3)    11.2 ± 0.1          2.03 ± 0.02

ΔG⁰ is the free energy of unfolding in the absence of denaturant; m_G is the
dependence of the free energy of unfolding on GuHCl concentration. For
solution conditions, see Figure 4 caption.
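
The parameters in Table 8 can be related to the shape of an unfolding curve through the linear extrapolation form of the two-state model, in which the free energy of unfolding at denaturant concentration [D] is ΔG = ΔG⁰ - m_G[D]. The sketch below computes the transition midpoint and the fraction unfolded from the tabulated values at the 30°C used in the measurements. It is a simplified illustration: the sloping pre- and post-transition baselines that a full fit of fluorescence data would include are omitted, and the function names are arbitrary.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def unfolding_free_energy(dg0, m_value, denaturant_molar):
    """Linear extrapolation: dG([D]) = dG0 - m * [D] (kcal/mol)."""
    return dg0 - m_value * denaturant_molar

def fraction_unfolded(dg0, m_value, denaturant_molar, temp_k=303.15):
    """Two-state equilibrium fraction of unfolded protein at [D]."""
    dg = unfolding_free_energy(dg0, m_value, denaturant_molar)
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temp_k)))

def midpoint(dg0, m_value):
    """Denaturant concentration at which half the protein is unfolded."""
    return dg0 / m_value

# (dG0, m_G) values from Table 8
table8 = {
    "Ubi4 (pH 7.5)": (4.8, 2.12),
    "Ubi4 (pH 3.3)": (6.5, 2.07),
    "Wild-type (pH 7.5)": (7.2, 1.60),
    "Wild-type (pH 3.3)": (11.2, 2.03),
}
for name, (dg0, m_value) in table8.items():
    print(f"{name}: midpoint ~ {midpoint(dg0, m_value):.2f} M GuHCl")
```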

Though the introduced mutations in the two loops certainly decreased the
stability of Ubi4-K relative to wild-type Fn3, the stability of Ubi4 remains
comparable to that of a "typical" globular protein. It should also be noted that
the stabilities of the wild-type and Ubi4-K proteins were higher at pH 3.3 than at
pH 7.5.

The Ubi4 protein had a significantly reduced solubility as compared to
that of wild-type Fn3, but the solubility was improved by the addition of a
solubility tail. Since the two mutated loops comprise the only differences between
the wild-type and Ubi4 proteins, these loops must be the origin of the reduced
solubility. At this point, it is not clear whether the aggregation of Ubi4-K is
caused by interactions between the loops, or by interactions between the loops
and the invariable regions of the Fn3 scaffold.

The Ubi4-K protein retained the global fold of Fn3, showing that this
scaffold can accommodate a large number of mutations in the two loops tested.
Though the stability of the Ubi4-K protein is significantly lower than that of the
wild-type Fn3 protein, the Ubi4 protein still has a conformational stability
comparable to those of small globular proteins. The use of a highly stable
domain as a scaffold is clearly advantageous for introducing mutations without
affecting the global fold of the scaffold. In addition, the GuHCl-induced
unfolding of the Ubi4 protein is almost completely reversible. This allows the
preparation of a correctly folded protein even when a Fn3 mutant is expressed in
a misfolded form, as in inclusion bodies. The modest stability of Ubi4 in the
conditions used for library screening indicates that Fn3 variants are folded on the
phage surface. This suggests that a Fn3 clone is selected by its binding affinity
in the folded form, not in a denatured form. Dickinson et al. proposed that Val
29 and Arg 30 in the BC loop stabilize Fn3. Val 29 makes contact with the
hydrophobic core, and Arg 30 forms hydrogen bonds with Gly 52 and Val 75. In
Ubi4-Fn3, Val 29 is replaced with Arg, while Arg 30 is conserved. The FG loop
was also mutated in the library. This loop is flexible in the wild-type structure,
and shows a large variation in length among human Fn3 domains (Main et al.,
1992). These observations suggest that mutations in the FG loop may have less
impact on stability. In addition, the N-terminal tail of Fn3 is adjacent to the
molecular surface formed by the BC and FG loops (Figures 1 and 17) and does
not form a well-defined structure. Mutations in the N-terminal tail would not be
expected to have strong detrimental effects on stability. Thus, residues in the
N-terminal tail may be good sites for introducing additional mutations.



Example XVIII 
NMR Spectroscopy of Ubi4-Fn3 
Ubi4-Fn3 was dissolved in [2H]-Gly HCl buffer (20 mM, pH 3.3)
containing NaCl (300 mM) using an Amicon ultrafiltration unit. The final
protein concentration was 1 mM. NMR experiments were performed on a
Varian Unity INOVA 600 spectrometer equipped with a triple-resonance probe
with pulsed field gradient. The probe temperature was set at 30 °C. HSQC,
TOCSY-HSQC and NOESY-HSQC spectra were recorded using published
procedures (Kay et al., 1992; Zhang et al., 1994). NMR spectra were processed
and analyzed using the NMRPipe and NMRView software (Johnson & Blevins,
1994; Delaglio et al., 1995) on UNIX workstations. Sequence-specific
resonance assignments were made using standard procedures (Wuthrich, 1986;
Clore & Gronenborn, 1991). The assignments for wild-type Fn3 (Baron et al.,
1992) were confirmed using a 15N-labeled protein dissolved in sodium acetate
buffer (50 mM, pH 4.6) at 30 °C.

The three-dimensional structure of Ubi4-K was characterized using this
heteronuclear NMR spectroscopy method. A high quality spectrum could be
collected on a 1 mM solution of 15N-labeled Ubi4 (Figure 17a) at low pH. The
linewidth of amide peaks of Ubi4-K was similar to that of wild-type Fn3,
suggesting that Ubi4-K is monomeric under the conditions used. Complete
assignments for backbone 1H and 15N nuclei were achieved using standard 1H,
15N double resonance techniques, except for a row of His residues in the
N-terminal (His)6 tag. There were a few weak peaks in the HSQC spectrum which
appeared to originate from a minor species containing the N-terminal Met
residue. Mass spectrometry analysis showed that a majority of Ubi4-K does not
contain the N-terminal Met residue. Fig. 17 shows differences in 1HN and 15N
chemical shifts between Ubi4-K and wild-type Fn3. Only small differences are
observed in the chemical shifts, except for those in and near the mutated BC and
FG loops. These results clearly indicate that Ubi4-K retains the global fold of
Fn3, despite the extensive mutations in the two loops. A few residues in the
N-terminal region, which is close to the two mutated loops, also exhibit significant
chemical shift differences between the two proteins. An HSQC spectrum was also
recorded on a 50 µM sample of Ubi4-K in TBS. The spectrum was similar to
that collected at low pH, indicating that the global conformation of Ubi4 is
maintained between pH 7.5 and 3.3.

Example XIX:
Stabilization of Fn3 domain by removing unfavorable electrostatic
interactions on the protein surface



Introduction

Increasing the conformational stability of a protein by mutation is a major
interest in protein design and biotechnology. The three-dimensional structures of
proteins are stabilized by a combination of different types of forces. The
hydrophobic effect, van der Waals interactions and hydrogen bonds are known to
contribute to stabilizing the folded state of proteins (Kauzmann, W. (1959) Adv.
Protein Chem. 14, 1-63; Dill, K. A. (1990) Biochemistry 29, 7133-7155; Pace, C.
N., Shirley, B. A., McNutt, M. & Gajiwala, K. (1996) FASEB J. 10, 75-83). These
stabilizing forces primarily originate from residues that are well packed in the
protein core. Because a change in the protein core would induce a rearrangement
of adjacent moieties, it is difficult to improve protein stability by increasing
these forces without massive computation (Malakauskas, S. M. & Mayo, S. L.
(1998) Nat. Struct. Biol. 5, 470-475). Ion pairs between charged groups are
commonly found on the protein surface (Creighton, T. E. (1993) Proteins:
structures and molecular properties. Freeman, New York), and an ion pair could
be introduced to a protein with small structural perturbations. However, a number
of studies have demonstrated that the introduction of an attractive electrostatic
interaction, such as an ion pair, on the protein surface has small effects on
stability (Dao-pin, S., Sauer, U., Nicholson, H. & Matthews, B. W. (1991)
Biochemistry 30, 7142-7153; Sali, D., Bycroft, M. & Fersht, A. R. (1991) J. Mol.
Biol. 220, 779-788). A large desolvation penalty and the loss of conformational
entropy of amino acid side chains oppose the favorable electrostatic contribution
(Yang, A.-S. & Honig, B. (1992) Curr. Opin. Struct. Biol. 2, 40-45; Hendsch, Z. S.
& Tidor, B. (1994) Protein Sci. 3, 211-226). Recent studies demonstrated that
repulsive electrostatic interactions on the protein surface, in contrast, may
significantly destabilize a protein, and that it is possible to improve protein
stability by optimizing surface electrostatic interactions (Loladze, V. V.,
Ibarra-Molero, B., Sanchez-Ruiz, J. M. & Makhatadze, G. I. (1999) Biochemistry
38, 16419-16423; Perl, D., Mueller, U., Heinemann, U. & Schmid, F. X. (2000)
Nat. Struct. Biol. 7, 380-383; Spector, S., Wang, M., Carp, S. A., Robblee, J.,
Hendsch, Z. S., Fairman, R., Tidor, B. & Raleigh, D. P. (2000) Biochemistry 39,
872-879; Grimsley, G. R., Shaw, K. L., Fee, L. R., Alston, R. W.,
Huyghues-Despointes, B. M., Thurlkill, R. L., Scholtz, J. M. & Pace, C. N. (1999)
Protein Sci. 8, 1843-1849). In the present experiments, the inventor improved
protein stability by modifying surface electrostatic interactions.

During the characterization of monobodies it was found that these
proteins, as well as wild-type FNfn10, are significantly more stable at low pH
than at neutral pH (Koide, A., Bailey, C. W., Huang, X. & Koide, S. (1998) J.
Mol. Biol. 284, 1141-1151). These observations indicate that changes in the
ionization state of some moieties in FNfn10 modulate the conformational
stability of the protein, and suggest that it might be possible to enhance the
conformational stability of FNfn10 at neutral pH by adjusting electrostatic
properties of the protein. Improving the conformational stability of FNfn10 will
also have practical importance in the use of FNfn10 as a scaffold in
biotechnology applications.

Described below are experiments that characterized in detail the pH
dependence of FNfn10 stability, identified unfavorable interactions between side
chain carboxyl groups, and improved the conformational stability of FNfn10 by
point mutations on the surface. The results demonstrate that the surface
electrostatic interactions contribute significantly to protein stability, and that it is
possible to enhance protein stability by rationally modulating these interactions.

Experimental Procedures

Protein expression and purification

The wild-type protein used for the NMR studies contained residues 1-94
of FNfn10 (residue numbering is according to Figure 2(a) of Koide et al. (Koide,
A., Bailey, C. W., Huang, X. & Koide, S. (1998) J. Mol. Biol. 284, 1141-1151)),
and two additional residues (Met-Gln) at the N-terminus (these two residues are
numbered -2 and -1, respectively). The gene coding for the protein was inserted
in pET3a (Novagen, WI). Escherichia coli BL21 (DE3) transformed with the
[...] and 15N-ammonium chloride (Cambridge Isotopes) as the sole carbon
[...] 284, 1141-1151). After harvesting the cells by centrifugation, the cells
were lysed [...] sodium acetate buffer (pH 5.0), and the protein solution was
[...] eluted with a gradient of sodium chloride. The protein was [...]
an Amicon concentrator using YM-3 membrane (Millipore).

The wild-type protein used for the stability measurements contained an
N-terminal histag (MGSSHHHHHHSSGLVPRGSH) (SEQ ID NO:114) and
residues -2-94 of FNfn10. The gene for FN3 described above was inserted in
pET15b (Novagen). The protein was expressed and [...]
(Koide, A., Bailey, C. W., Huang, X. & Koide, S. (1998) J. Mol. Biol. 284,
1141-1151). The wild-type protein used for measurements of the pH
dependence shown in Figure [...] contained an Arg 6 to Thr mutation, which
[...] Asp 7, which is adjacent to Arg 6, was found to be critical in the pH
[...] performed using the wild-type, Arg 6, background. The genes for the D7N and
[...] inserted in pET15b. These proteins were prepared in the same manner as for
the wild-type protein. [...] proteins for pKa measurements were prepared
as described above, and the histag moiety was not removed from these proteins.

Chemical denaturation measurements

Proteins were dissolved to a final concentration of 5 µM in 10 mM
sodium citrate buffer at various pH containing 100 mM sodium chloride.
Guanidine HCl (GuHCl)-induced unfolding experiments were performed as
described previously (Koide, A., Bailey, C. W., Huang, X. & Koide, S. (1998) J.
Mol. Biol. 284, 1141-1151; Koide, S., Bu, Z., Risal, D., Pham, T.-N., Nakagawa,
T., Tamura, A. & Engelman, D. M. (1999) Biochemistry 38, 4757-4767).
GuHCl concentration was determined using an Abbe refractometer (Spectronic
Instruments) as described (Pace, C. N. & Scholtz, J. M. (1997) in Protein
structure. A practical approach (Creighton, T. E., Ed.) pp 299-321, IRL
Press, Oxford). Data were analyzed according to the two-state model as
described (Koide, A., Bailey, C. W., Huang, X. & Koide, S. (1998) J. Mol. Biol.
284, 1141-1151; Santoro, M. M. & Bolen, D. W. (1988) Biochemistry 27,
8063-8068).

Thermal denaturation measurements

Proteins were dissolved to a final concentration of 5 µM in 20 mM
sodium phosphate buffer (pH 7.0) containing 0.1 or 1 M sodium chloride or in
20 mM glycine HCl buffer (pH 2.4) containing 0.1 or 1 M sodium chloride.
Additionally, 6.3 M urea was included in all solutions to ensure reversibility of
the thermal denaturation reaction. In the absence of urea it was found that
denatured FNfn10 adheres to the quartz surface, and that the thermal denaturation
reaction was irreversible. Circular dichroism measurements were performed
using a Model 202 spectrometer equipped with a Peltier temperature controller
(Aviv Instruments). A cuvette with a 0.5-cm pathlength was used. The
ellipticity at 227 nm was recorded as the sample temperature was raised at a rate
of approximately 1 °C per minute. Because of decomposition of urea at high
temperature, the pH of protein solutions tended to shift upward during an
experiment. The pH of the protein solution was measured before and after each
thermal denaturation measurement to ensure that a shift of no more than 0.2 pH
unit occurred in each measurement. At pH 2.4, two sections of a thermal
denaturation curve (30-65 °C and 60-95 °C) were acquired from separate
samples, in order to avoid a large pH shift. The thermal denaturation data were
fit with the standard two-state model (Pace, C. N. & Scholtz, J. M. (1997) in
Protein structure. A practical approach (Creighton, T. E., Ed.) pp 299-321,
IRL Press, Oxford):

\[
\Delta G(T) = \Delta H_m \left(1 - \frac{T}{T_m}\right) - \Delta C_p \left[(T_m - T) + T \ln\left(\frac{T}{T_m}\right)\right]
\]

where ΔG(T) is the Gibbs free energy of unfolding at temperature T, ΔHm is the
enthalpy change upon unfolding at the midpoint of the transition, Tm, and ΔCp
is the heat capacity change upon unfolding. The value for ΔCp was fixed at 1.74
kcal mol-1 K-1 according to the approximation of Myers et al. (Myers, J. K.,
Pace, C. N. & Scholtz, J. M. (1995) Protein Sci. 4, 2138-2148). Most of the
data taken in the presence of 1 M NaCl did not have a sufficient baseline for
the denatured state, and the slope of the denatured-state baseline in
the presence of 1 M NaCl was therefore assumed to be identical to that determined
in the presence of 0.1 M NaCl.
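
The two-state thermal unfolding expression above can be evaluated numerically as a quick sanity check. The sketch below is only illustrative: the heat capacity change (1.74 kcal mol-1 K-1) and the 62 °C midpoint are quoted from this example, whereas the enthalpy value `DELTA_H_M` is a hypothetical placeholder, and the conversion to a fraction unfolded via K = exp(-ΔG/RT) is the usual two-state assumption rather than a step spelled out in the text.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1
DELTA_CP = 1.74     # kcal mol^-1 K^-1, fixed value quoted in the text


def delta_g_unfolding(temp_k, tm_k, delta_h_m, delta_cp=DELTA_CP):
    """dG(T) = dHm (1 - T/Tm) - dCp [(Tm - T) + T ln(T/Tm)], in kcal/mol."""
    return (delta_h_m * (1.0 - temp_k / tm_k)
            - delta_cp * ((tm_k - temp_k) + temp_k * math.log(temp_k / tm_k)))


def fraction_unfolded(temp_k, tm_k, delta_h_m, delta_cp=DELTA_CP):
    """Two-state population of the unfolded form, from K = exp(-dG / RT)."""
    dg = delta_g_unfolding(temp_k, tm_k, delta_h_m, delta_cp)
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


TM_K = 62.0 + 273.15   # wild type, pH 7.0, 0.1 M NaCl, 6.3 M urea (Table 10)
DELTA_H_M = 60.0       # kcal/mol, HYPOTHETICAL value for illustration only

for celsius in (30.0, 62.0, 82.0):
    t = celsius + 273.15
    print(celsius, round(delta_g_unfolding(t, TM_K, DELTA_H_M), 2),
          round(fraction_unfolded(t, TM_K, DELTA_H_M), 3))
```

By construction ΔG vanishes at Tm, so the unfolded fraction there is one half regardless of the enthalpy chosen; only the steepness of the transition depends on the placeholder value.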

NMR spectroscopy

NMR experiments were performed at 30 °C on an INOVA 600
spectrometer (Varian Instruments). The C(CO)NH experiment (Grzesiek, S.,
Anglister, J. & Bax, A. (1993) J. Magn. Reson. B 101, 114-119) and the
CBCACOHA experiment (Kay, L. E. (1993) J. Am. Chem. Soc. 115, 2055-2057)
were collected on a [13C, 15N]-wild-type FNfn10 sample (1 mM) dissolved in
50 mM sodium acetate buffer (pH 4.6) containing 5 % (v/v) deuterium oxide,
using a Varian 5 mm triple resonance probe with pulsed field gradient. The
carboxyl 13C resonances were assigned based on the backbone 1H, 13C and 15N
resonance assignments of FNfn10 (Baron, M., Main, A. L., Driscoll, P. C.,
Mardon, H. J., Boyd, J. & Campbell, I. D. (1992) Biochemistry 31, 2068-2073).
pH titration of carboxyl resonances was performed on a 0.3 mM FNfn10 sample
[...] (Nanolac Corporation) was used for pH titration. Two-dimensional H(C)CO
spectra were collected using the CBCACOHA pulse sequence as described
previously (McIntosh, L. P., Hand, G., Johnson, P. E., Joshi, M. D., Körner, M.,
Plesniak, L. A., Ziser, L., Wakarchuk, W. W. & Withers, S. G. (1996)
Biochemistry 35, 9958-9966). Sample pH was changed by adding small aliquots
of hydrochloric acid, and pH was measured before and after taking NMR data.
1H, 15N-HSQC spectra were taken as described previously (Kay, L. E., Keifer, P.
& Saarinen, T. (1992) J. Am. Chem. Soc. 114, 10663-10665). NMR data were
processed using the NMRPipe package (Delaglio, F., Grzesiek, S., Vuister, G.
W., Zhu, G., Pfeifer, J. & Bax, A. (1995) J. Biomol. NMR 6, 277-293), and
analyzed using the NMRView software (Johnson, B. A. & Blevins, R. A. (1994)
J. Biomol. NMR 4, 603-614).

NMR titration curves of the carboxyl 13C resonances were fit to the
Henderson-Hasselbalch equation to determine pKa's:

\[
\delta(\mathrm{pH}) = \frac{\delta_{acid} + \delta_{base}\, 10^{(\mathrm{pH} - pK_a)}}{1 + 10^{(\mathrm{pH} - pK_a)}}
\]

where δ is the measured chemical shift, δacid is the chemical shift associated with
the protonated state, δbase is the chemical shift associated with the deprotonated
state, and pKa is the pKa value for the residue. Data were also fit to an equation
with two ionizable groups:

\[
\delta(\mathrm{pH}) = \frac{\delta_{AH_2} + \delta_{AH}\, 10^{(\mathrm{pH} - pK_{a1})} + \delta_{A}\, 10^{(2\mathrm{pH} - pK_{a1} - pK_{a2})}}{1 + 10^{(\mathrm{pH} - pK_{a1})} + 10^{(2\mathrm{pH} - pK_{a1} - pK_{a2})}}
\]

where δAH2, δAH and δA are the chemical shifts associated with the fully
protonated, singly protonated and deprotonated states, respectively, and pKa1
and pKa2 are pKa's associated with the two ionization steps. Data fitting was
performed using the nonlinear least-square regression method in the program
Igor Pro (WaveMetrics, OR) on a Macintosh computer.
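
The two titration models can be written as plain functions, which makes their limiting behavior easy to inspect before any fitting is attempted. This is a minimal sketch: the function names and the chemical shift values (176 to 180 ppm) are hypothetical, the pKa values are those reported for Asp 80 and Asp 23 in Table 9, and no attempt is made to reproduce the least-squares fitting performed in Igor Pro.

```python
def one_site_shift(ph, pka, shift_acid, shift_base):
    """Single-pKa Henderson-Hasselbalch titration curve for a chemical shift."""
    w = 10.0 ** (ph - pka)
    return (shift_acid + shift_base * w) / (1.0 + w)


def two_site_shift(ph, pka1, pka2, shift_ah2, shift_ah, shift_a):
    """Titration curve with two ionization steps (AH2 -> AH -> A)."""
    w1 = 10.0 ** (ph - pka1)
    w2 = 10.0 ** (2.0 * ph - pka1 - pka2)
    return (shift_ah2 + shift_ah * w1 + shift_a * w2) / (1.0 + w1 + w2)


# Hypothetical shifts in ppm; pKa values taken from Table 9.
ASP80_PKA = 3.40
ASP23_PKAS = (3.54, 5.25)

for ph in (2.0, 3.4, 5.0, 7.0):
    single = one_site_shift(ph, ASP80_PKA, 176.0, 180.0)
    double = two_site_shift(ph, *ASP23_PKAS, 176.0, 178.0, 180.0)
    print(f"pH {ph:.1f}: one-site {single:.2f} ppm, two-site {double:.2f} ppm")
```

At pH equal to the pKa the one-site curve sits exactly halfway between the two limiting shifts, and the two-site curve approaches the fully protonated and fully deprotonated shifts at the extremes of pH.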

Results

pH Dependence of FNfn10 stability

Previously, it was found that FNfn10 is more stable at acidic pH than at
neutral pH (Koide, A., Bailey, C. W., Huang, X. & Koide, S. (1998) J. Mol. Biol.
284, 1141-1151). In the present experiments, the pH dependence of its stability
was further characterized. Because of its high stability, FNfn10 could not be
fully denatured in urea at 30 °C. Thus GuHCl-induced chemical denaturation
(Figure [...]) was used. The denaturation reaction was [...] The free
energy of unfolding at 4 M GuHCl was used for comparison (Figure [...]). [...]
pH range. The pH dependence curve has an apparent transition midpoint
[...] free energy on denaturant concentration was not [...]. Pace et al.
reported a similar pH dependence of the m value for barnase (Pace, C. N.,
Laurents, D. V. & Erickson, R. E. (1992) Biochemistry 31, 2728-2734). [...]
or those that destabilize it at neutral pH. The results also suggest that by
[...] to that found at low pH.

Determination of pKa's of the side chain carboxyl groups in wild-type FNfn10

The pH dependence of FNfn10 stability suggests that amino acids [...]
pKa's [...] are involved in the observed transition. The carboxyl groups of Asp
and Glu generally have pKa's in this range (Creighton, T. E. (1993) Proteins:
structures and molecular properties. Freeman, New York). [...]
its pKa is shifted to a higher value from its unperturbed value (Yang, [...]
favorable interactions in the folded state, it has a lower pKa. Thus, the pKa [...]
involving carboxyl groups.

First, the 13C resonance for the carboxyl carbon of each Asp and Glu
residue in FN3 was assigned (Figure 19). Next, pH titration of the 13C
resonances for these groups was performed (Figure 20). Titration curves for Asp
3, 67 and 80, and Glu 38 and 47 could be fit well with the Henderson-Hasselbalch
equation with a single pKa. The pKa values for these residues (Table
9) are either close to or slightly lower than their respective unperturbed values
(3.8-4.1 for Asp, and 4.1-4.6 for Glu (Kuhlman, B., Luisi, D. L., Young, P. &
Raleigh, D. P. (1999) Biochemistry 38, 4896-4903)), indicating that these
carboxyl groups are involved in neutral or slightly favorable electrostatic
interactions in the folded state.

Table 9. pKa values for Asp and Glu residues in FN3 (1)

Residue    Protein
           Wild-Type         D7N     D7K
E9         3.84, 5.40 (2)    4.98    4.53
E38        3.79              3.87    3.86
E47        3.94              3.99    3.99
D3         3.66              3.72    3.74
D7         3.54, 5.54 (2)
D23        3.54, 5.25 (2)    3.68    3.82
D67        4.18              4.17    4.14
D80        3.40              3.49    3.48

(1) Standard deviations in the pKa values are less than 0.05 pH units for those fit
with a single pKa and less than 0.15 pH unit for those with two pKa's.
(2) Data for E9, D7 and D23 were fit with a transition curve with two pKa values.

The titration curves for Asp 7 and 23, and Glu 9 were fit better with the
Henderson-Hasselbalch equation with two pKa values, and one of the two pKa
values for each was shifted higher than the respective unperturbed values
(Figure 19B). The titration curves with two apparent pKa values of these
carboxyl groups may be due to the influence of an ionizable group in the vicinity. In
the three-dimensional structure of FNfn10 (Main, A. L., Harvey, T. S., Baron,
M., Boyd, J. & Campbell, I. D. (1992) Cell 71, 671-678), Asp 7 and 23, and Glu
9 form a patch on the surface (Figure 21), with Asp 7 centrally located in the
patch. Thus, it is reasonable to expect that these residues influence each other's
ionization profile. In order to identify which of the three residues have a highly
upshifted pKa, the H(C)CO spectrum of the protein in 99 % D2O buffer at pH*
5.0 (direct pH meter reading) was then collected. Asp 23 and Glu 9 showed
larger deuterium isotope shifts (0.33 and 0.32 ppm, respectively) than Asp 7
(0.18 ppm). These results show that Asp 23 and Glu 9 are protonated to a
greater degree than Asp 7. Thus, we concluded that Asp 23 and Glu 9 have
highly upshifted pKa's, due to strong influence of Asp 7.



Mutational analysis

The spatial proximity of Asp 7 and 23, and Glu 9 explains the
[...] expected to be mostly relieved. Thus, it should be possible to improve the
stability of FNfn10 at neutral pH, by removing the electrostatic repulsion
[...] residues, it was decided to mutate Asp 7. Two mutants, D7N and D7K, were
prepared. The former neutralizes the negative charge with a residue of virtually
identical size. The latter places a positive charge at residue 7 and increases the
size of the side chain.

The 1H, 15N-HSQC spectra of the two mutant proteins were nearly
identical to that of the wild-type protein, indicating that these mutations did not
cause large structural perturbations (data not shown). The degrees of stability of
the mutant proteins were then characterized using thermal and chemical
denaturation measurements. Thermal denaturation measurements were
performed initially with 100 mM sodium chloride, and 6.3 M urea was included
to ensure reversible denaturation and to decrease the temperature of the thermal
transition. All the proteins were predominantly folded in 6.3 M urea at room
temperature. All the proteins underwent a cooperative transition, and the two
mutants were found to be significantly more stable than the wild type at neutral
pH (Figure 22 and Table 10). Furthermore, these mutations almost eliminated
the pH dependence of the conformational stability of FNfn10. These results
[...] neutral pH are the primary cause of the pH dependence.

Table 10. The midpoint of thermal denaturation (in °C) of wild-type and
mutant FN3 in the presence of 6.3 M urea.

Protein      pH 2.4                      pH 7.0
             0.1 M NaCl    1 M NaCl      0.1 M NaCl    1 M NaCl
wild type    72            82            62            70
D7N          68            82            69            80
D7K          69            77            70            78

The error in the midpoints for the 0.1 M NaCl data is ± 0.5 °C. Because most of the 1 M NaCl
data did not have a sufficient baseline for the denatured state, the error in the midpoints for these
data was estimated to be ± 2 °C.
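
The comparisons drawn from Table 10 reduce to a few differences between midpoints, which the following sketch tabulates. The dictionary layout and helper names are illustrative choices, not part of the specification; the numbers are the Table 10 midpoints in °C.

```python
# Thermal denaturation midpoints (deg C) from Table 10, keyed by (pH, NaCl in M).
TM = {
    "wild type": {(2.4, 0.1): 72, (2.4, 1.0): 82, (7.0, 0.1): 62, (7.0, 1.0): 70},
    "D7N":       {(2.4, 0.1): 68, (2.4, 1.0): 82, (7.0, 0.1): 69, (7.0, 1.0): 80},
    "D7K":       {(2.4, 0.1): 69, (2.4, 1.0): 77, (7.0, 0.1): 70, (7.0, 1.0): 78},
}


def ph_shift(protein, nacl):
    """Tm at pH 2.4 minus Tm at pH 7.0, at a fixed NaCl concentration."""
    return TM[protein][(2.4, nacl)] - TM[protein][(7.0, nacl)]


def salt_shift(protein, ph):
    """Tm in 1 M NaCl minus Tm in 0.1 M NaCl, at a fixed pH."""
    return TM[protein][(ph, 1.0)] - TM[protein][(ph, 0.1)]


def mutant_gain(protein, ph, nacl):
    """Tm of a mutant minus Tm of the wild type under the same conditions."""
    return TM[protein][(ph, nacl)] - TM["wild type"][(ph, nacl)]


for name in TM:
    print(name, "pH shift (0.1 M NaCl):", ph_shift(name, 0.1),
          "| salt shift (pH 7.0):", salt_shift(name, 7.0))
```

In 0.1 M NaCl the wild type loses 10 °C on going from pH 2.4 to pH 7.0, whereas the midpoints of both mutants change by about 1 °C, which is the pattern described as the near elimination of the pH dependence.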

The effect of increased sodium chloride concentration on the
conformational stability of the wild type and the two mutant proteins was next
investigated. All proteins were more stable in 1 M sodium chloride than in
0.1 M sodium chloride (Figure 22). The increase of the sodium chloride
concentration elevated the Tm of the mutant proteins by approximately 10 °C at
both acidic and neutral pH (Table 10). Remarkably, the wild-type protein was
also equally stabilized at both pH values, although it contains unfavorable
interactions among the carboxyl groups at neutral pH but not at acidic pH.

Chemical denaturation of FNfn10 proteins was monitored using
fluorescence emission from the single Trp residue of FNfn10 (Figure 23). The
free energies of unfolding at pH 6.0 and 4 M GuHCl were determined to be
1.1 (± 0.3), 1.7 (± 0.2) and 1.4 (± 0.1) kcal/mol for the wild type, D7N and D7K,
respectively, indicating that the two mutations also increased the conformational
stability against chemical denaturation.
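
The stabilization implied by these free energies can be expressed as a difference relative to the wild type. The sketch below assumes the quoted uncertainties are independent and combines them in quadrature; that error treatment is an assumption added here for illustration and is not described in the text.

```python
import math

# Free energies of unfolding at pH 6.0 and 4 M GuHCl: (value, error) in kcal/mol.
DG_4M = {
    "wild type": (1.1, 0.3),
    "D7N": (1.7, 0.2),
    "D7K": (1.4, 0.1),
}


def stabilization(mutant):
    """Return (ddG, propagated error) of a mutant relative to the wild type.

    The error is combined in quadrature, assuming independent uncertainties.
    """
    dg_mut, err_mut = DG_4M[mutant]
    dg_wt, err_wt = DG_4M["wild type"]
    return dg_mut - dg_wt, math.hypot(err_mut, err_wt)


for name in ("D7N", "D7K"):
    ddg, err = stabilization(name)
    print(f"{name}: ddG = {ddg:.1f} +/- {err:.2f} kcal/mol")
```

Both differences are positive, in the same direction as the increases in thermal midpoint listed in Table 10.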

Determination of the pKa's of the side chain carboxyl groups in the mutant
proteins

The ionization properties of carboxyl groups in the two mutant proteins
were investigated. The 2D H(C)CO spectra of the mutant proteins at the high and
low ends of the pH titration (pH ~7 and ~1.5, respectively) were nearly identical
to the respective spectra of the wild type, except for the loss of the cross peaks
for Asp 7 (data not shown). This similarity allowed for an unambiguous
assignment of resonances of the mutants, based on the assignments for wild-type
FNfn10. The pH titration experiments revealed that, except for Glu 9 and Asp
23, the behaviors of Asp and Glu carboxyl groups are very close to their
counterparts in the wild-type protein (Figure 24 Panels A, C, D, F and G, and
Table 9), indicating that the two mutations have marginal effects on the
electrostatic environments for these carboxylates. In contrast, the titration curves
for E9 and D23 show significant changes upon mutation (Figure 24 Panels B and
E). The pKa of D23 was lowered by more than 1.6 and 1.4 pH units in the D7N
and D7K mutants, respectively. These results clearly show that the repulsive
interaction between D7 and D23 contributes to the increase in pKa of Asp 23 in
the wild-type protein, and that it was eliminated by the neutralization of the
negative charge at residue 7. The pKa of Glu 9 was reduced by 0.4 pH unit by
the D7N mutation, while it was decreased by 0.8 pH units in the D7K mutant.
The greater reduction of Glu 9 pKa by the D7K mutation suggests that there is a
favorable interaction between Lys 7 and Glu 9 in this mutant protein.



Discussion 


The present inventor has identified unfavorable electrostatic interactions
in FNfn10, and improved its conformational stability by mutations on the protein
surface. The results demonstrate that repulsive electrostatic interactions
on the protein surface significantly destabilize a protein. The results are also
consistent with recent reports by other groups (Loladze, V. V., Ibarra-Molero, B.,
Sanchez-Ruiz, J. M. & Makhatadze, G. I. (1999) Biochemistry 38, 16419-16423;
Perl, D., Mueller, U., Heinemann, U. & Schmid, F. X. (2000) Nat. Struct. Biol. 7,
380-383; Spector, S., Wang, M., Carp, S. A., Robblee, J., Hendsch, Z. S.,
Fairman, R., Tidor, B. & Raleigh, D. P. (2000) Biochemistry 39, 872-879;
Grimsley, G. R., Shaw, K. L., Fee, L. R., Alston, R. W., Huyghues-Despointes,
B. M., Thurlkill, R. L., Scholtz, J. M. & Pace, C. N. (1999) Protein Sci. 8,
1843-1849) in which protein stability was improved by eliminating unfavorable
electrostatic interactions on the surface. In these studies, candidates for



WO 03/104418 PCT/US03/18030



mutations were identified by electrostatic calculations (Loladze, V. V.,
Ibarra-Molero, B., Sanchez-Ruiz, J. M. & Makhatadze, G. I. (1999) Biochemistry 38,
16419-16423; Spector, S., Wang, M., Carp, S. A., Robblee, J., Hendsch, Z. S.,
Fairman, R., Tidor, B. & Raleigh, D. P. (2000) Biochemistry 39, 872-879;
Grimsley, G. R., Shaw, K. L., Fee, L. R., Alston, R. W., Huyghues-Despointes,
B. M., Thurlkill, R. L., Scholtz, J. M. & Pace, C. N. (1999) Protein Sci. 8,
1843-1849) or by sequence comparison of homologous proteins with different
stability (Perl, D., Mueller, U., Heinemann, U. & Schmid, F. X. (2000) Nat.
Struct. Biol. 7, 380-383). The present strategy, which relies on pKa determination
by NMR, has both advantages and disadvantages relative to the other strategies.
The present method directly identifies residues that destabilize a protein. Also,
it does not depend on the availability of a high-resolution structure of the
protein of interest. Electrostatic calculations may have large errors due to the
flexibility of amino acid side chains on the surface, and the uncertainty in the
dielectric constant on the protein surface and in the protein interior. For
example, in the NMR structure of FNfn10 (Main, A. L., Harvey, T. S., Baron, M.,
Boyd, J. & Campbell, I. D. (1992) Cell 71, 671-678), the root mean squared
deviations among 16 model structures for the Oε atoms of Glu residues are
1.2-2.4 Å, and those for Lys Nζ atoms are 1.5-3.1 Å. Such uncertainties in atom
position can potentially cause large differences in calculation results. On the
other hand, the present strategy requires the NMR assignments for carboxyl
residues, and NMR measurements over a wide pH range. Although recent advances
in NMR spectroscopy have made it straightforward to obtain resonance assignments
for a small protein, some proteins may not be sufficiently soluble over the
desired pH range. In addition, knowledge of the pKa values of ionizable groups
in the denatured state is necessary for accurately evaluating contributions of
individual residues to stability (Yang, A.-S. & Honig, B. (1992) Curr. Opin.
Struct. Biol. 2, 40-45). Kuhlman et al. (Kuhlman, B., Luisi, D. L., Young, P. &
Raleigh, D. P. (1999) Biochemistry 38, 4896-4903) showed that pKa's of
carboxylates in the denatured state have a considerably larger range than those
obtained from small model compounds. Despite these limitations, the present
method is applicable to
many proteins.

The inventor showed that the unfavorable interactions involving the
carboxyl groups of Asp 7, Glu 9 and Asp 23 were no longer present if these
groups are protonated at low pH or if Asp 7 was replaced with Asn or Lys. The
similarity in the measured stability of the mutants and the wild type at low pH
(Table 10) suggests that no other factors significantly contribute to the pH
[...] perturbations. The little structural perturbation was expected, since the
[...] of these three residues are at least 50 % exposed to the solvent, based on
[...] (Main, A. L., Harvey, T. S., Baron, M., Boyd, J. & Campbell, I. D. (1992)
Cell 71, 671-678). The difference in thermal stability of the wild-type protein
between acidic [...] 0.1 to 1.0 M, the Tm of the wild-type and mutant proteins
all increased by ~10 °C, which is on the same magnitude as the change in Tm of
the wild type by the pH shift. These data indicate that the unfavorable
interactions identified in this study were not effectively shielded in 1 M NaCl
or in 4 M GuHCl. Because the [...] V. (1992) Curr. Opin. Struct. Biol. 2, 35-39).
Other groups also reported little [...] A. M., Clore, G. M., Fogg, J. H. & Shih,
D. T. (1985) [...]; Hendsch, Z. S., Jonsson, T., Sauer, R. T. & Tidor, B. (1996)
Biochemistry 35, 7621-7625). Electrostatic interactions are often thought to
diminish with [...] Accordingly, the present data at neutral pH (Table 10)
showing no difference in the salt sensitivity between the wild type and the
mutants could be interpreted as [...]
cautionary note on concluding the presence and absence of electrostatic
interactions solely based on salt concentration dependence.

The carboxyl triad (Asp 7 and 23, and Glu 9) is highly conserved in
FNfn10 from nine different organisms that were available in the protein
sequence databank at the National Center for Biotechnology Information
(www.ncbi.nlm.nih.gov). In these FNfn10 sequences, Asp 7 is conserved except
in one case where it is replaced with Asn, and Glu 9 is completely conserved.
Position 23 is either Asp or Glu, preserving the negative charge. As was
discovered in this study, the interactions among these residues are destabilizing.
Thus, their high conservation, despite their negative effects on stability, suggests
that these residues have functional importance in the biology of fibronectin. In
the structure of a four-FN3 segment of human fibronectin (Leahy, D. J., Aukhil,
I. & Erickson, H. P. (1996) Cell 84, 155-164), these residues are not directly
involved in interactions with adjacent domains. Also, these residues are located
on the opposite face of FNfn10 from the integrin-binding RGD sequence in the
FG loop (Figure 21). Therefore, it is not clear why these destabilizing residues
are almost completely conserved in FNfn10. In contrast, no other FN3 domains
in human fibronectin contain this carboxyl triad (for a sequence alignment, see
ref Main, A. L., Harvey, T. S., Baron, M., Boyd, J. & Campbell, I. D. (1992) Cell
71, 671-678). The carboxyl triad of FNfn10 may be involved in important
interactions that have not been identified to date.

Clarke et al. (Clarke, J., Hamill, S. J. & Johnson, C. M. (1997) J. Mol. Biol.
270, 771-778) reported that the stability of the third FN3 of human tenascin
(TNfn3) increases as pH is decreased from 7 to 5. Although they could not
perform stability measurements below pH 5 due to protein aggregation, the pH
dependence of TNfn3 resembles that of FNfn10 shown in Figure 18. TNfn3
does not contain the carboxylate triad at positions 7, 9 and 23 (Leahy, D. J.,
Hendrickson, W. A., Aukhil, I. & Erickson, H. P. (1992) Science 258, 987-991),
indicating that the destabilization of TNfn3 at neutral pH is caused by a different
mechanism from that for FNfn10. A visual inspection of the TNfn3 structure
revealed that it has a large number of carboxyl groups, and that Glu 834 and Asp
850 (numbering according to ref Leahy, D. J., Hendrickson, W. A., Aukhil, I. &
Erickson, H. P. (1992) Science 258, 987-991) form a cross-strand pair. It will be
interesting to examine whether altering this pair can increase the stability of
TNfn3.

In conclusion, a strategy has been described to experimentally identify
unfavorable electrostatic interactions on the protein surface and improve the
protein stability by relieving such interactions. The present results have
demonstrated that forming a repulsive interaction between carboxyl groups
significantly destabilizes a protein. This is in contrast to the small contribution
of forming a solvent-exposed ion pair. Unfavorable electrostatic interactions on
the surface seem quite common in natural proteins. Therefore, optimization of
the surface electrostatic properties provides a generally applicable strategy for
increasing protein stability (Loladze, V. V., Ibarra-Molero, B., Sanchez-Ruiz, J.
M. & Makhatadze, G. I. (1999) Biochemistry 38, 16419-16423; Perl, D.,
Mueller, U., Heinemann, U. & Schmid, F. X. (2000) Nat Struct Biol 7, 380-383;
Spector, S., Wang, M., Carp, S. A., Robblee, J., Hendsch, Z. S., Fairman, R.,
Tidor, B. & Raleigh, D. P. (2000) Biochemistry 39, 872-879; Grimsley, G. R.,
Shaw, K. L., Fee, L. R., Alston, R. W., Huyghues-Despointes, B. M., Thurlkill,
R. L., Scholtz, J. M. & Pace, C. N. (1999) Protein Sci 8, 1843-1849). In
addition, repulsive interactions between carboxylates can be exploited for
destabilizing undesirable, alternate conformations in protein design ("negative
design").



EXAMPLE XX

An extension of the carboxyl-terminus of the monobody scaffold

The wild-type protein used for stability measurements is described under
Example 19. The carboxyl-terminus of the monobody scaffold was extended by
four amino acid residues, namely, Glu-Ile-Asp-Lys (SEQ ID NO:119), which
are the ones that immediately follow FNfn10 of human fibronectin. The
extension was introduced into the FNfn10 gene using standard PCR methods.
Stability measurements were performed as described under Example 19. The
free energy of unfolding of the extended protein was 7.4 kcal/mol at pH 6.0
and 30 °C, very close to that of the wild-type protein (7.7 kcal/mol). These
results demonstrate that the C-terminus of the monobody scaffold can be
extended without decreasing its stability.

EXAMPLE XXI 
Reconstitution of proteins

Design and Production of Fragments by Cyanogen Bromide cleavage of a mutant
FNfn10

Choice of cyanogen bromide for the cleavage 

To produce fragments of the FNfn10 protein, a cleavage site was engineered
into loop regions. The insertion of a methionine residue flanked by two glycine
residues on each side presented a cyanogen bromide cleavage site in a flexible
region. This method had the benefit that the protein could be expressed and
purified with already existing protocols, and both fragments were produced at
the same time. Since no methionine residue is present in the wild type sequence
of FNfn10, this method allowed specific cleavage at the introduced site.
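
The cleavage rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the sequences are made-up toy strings and not the FNfn10 sequence, and `cnbr_fragments` is a hypothetical helper name. The function splits a one-letter amino acid sequence after every methionine, which is where cyanogen bromide cuts.

```python
def cnbr_fragments(sequence):
    """Split a one-letter amino acid sequence after every methionine (M).

    Cyanogen bromide cleaves the peptide bond that follows a methionine,
    so each fragment except possibly the last one ends in M. (Chemically the
    methionine is converted to a homoserine lactone; this sketch keeps the
    letter M for readability.)
    """
    fragments = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue == "M":
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    # Toy sequence (ASSUMPTION, for illustration only): a stretch without
    # methionine into which a GGMGG cleavage site has been inserted.
    toy = "AAAA" + "GGMGG" + "KKKK"
    print(cnbr_fragments(toy))
```

A sequence without any methionine is returned as a single fragment, which mirrors the argument that cleavage is specific to the introduced site.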

Location of the introduced cleavage site within FNfn10

A suitable site for the separation of two fragments of FNfn10 had to be
determined. Both practical aspects of the cleavage and its intended application in
selection experiments constrained the position of the cleavage site. A cleavage
site within a more flexible loop region is more likely to result in protein
reconstitution. With library construction in mind, an ideal split of the protein
would result in fragments that each contain a portion of the molecule
amenable to the introduction of a library. In the original design, the BC and
FG loops were utilized to host restrained peptide libraries; therefore, these
loops should ideally remain uncut (see Figures 1B, 1D). The DE loop is also a
potential cleavage site, though its proximity to the FG and BC loops may
interfere with target binding by the BC and/or FG loops. To separate the BC and
FG loops into two different fragments, the CD loop or the EF loop region
remained as possible cleavage sites.

The elongation of loop regions introduces a destabilizing effect on the
protein conformation. In FNfn10, the destabilizing effect of modifications in
the CD loop tended to be much less pronounced than the effect of modifications
in the EF loop. The cleavage of a peptide bond in a loop region introduces an
increased degree of freedom to nearby residues compared to loop elongation.
For reconstitution, a more stable protein should give an advantage.
Consequently, it was tested whether a stabilizing mutation far away from the
cleavage site, such as a surface charge alteration that has been demonstrated
(Koide et al. 2001), increases the affinity of a reconstitution reaction. To that
end, a total of four mutants for cleavage were constructed (Table 11).

Name   Basis protein   Insertion           ΔG0          ΔG at 3 M GuHCl   m
                                           (kcal/mol)   (kcal/mol)        (kcal/mol/M)
CD45   Wild type       GGMGA in CD loop    6.4           1.4              1.69
CD92   D7KE9Q          GGMGG in CD loop    7.9           1.6              2.10
EF45   Wild type       GGMGG in EF loop    6.3          -1.0              2.43
EF92   D7KE9Q          GGMGG in EF loop    6.9          -1.2              2.69

Table 11: Constructed proteins for fragmentation experiments and their free
energy of unfolding ΔG.

Either the wild type or the D7KE9Q mutant was used as the template. The
cleavage site, GGMGG, was inserted either in the CD or EF loop regions (see
Figure 25).
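
The three numeric columns of Table 11 can be cross-checked against each other. The sketch below assumes that the columns are, in order, the free energy of unfolding without denaturant (ΔG0), the free energy at 3 M GuHCl, and the slope m of the usual linear extrapolation model ΔG = ΔG0 - m·[GuHCl]. That reading of the columns and the units (kcal/mol and kcal/mol/M) are assumptions; the function names are hypothetical.

```python
# Values transcribed from Table 11.
# name: (dG0 in kcal/mol, m in kcal/mol/M, tabulated dG at 3 M GuHCl in kcal/mol)
TABLE_11 = {
    "CD45": (6.4, 1.69, 1.4),
    "CD92": (7.9, 2.10, 1.6),
    "EF45": (6.3, 2.43, -1.0),
    "EF92": (6.9, 2.69, -1.2),
}


def delta_g(dg0, m_value, denaturant_molar):
    """Linear extrapolation model (ASSUMED): dG = dG0 - m * [denaturant]."""
    return dg0 - m_value * denaturant_molar


def recompute(table, denaturant_molar=3.0):
    """Return the free energy predicted by the linear model for each protein."""
    return {
        name: delta_g(dg0, m_value, denaturant_molar)
        for name, (dg0, m_value, _tabulated) in table.items()
    }


if __name__ == "__main__":
    for name, predicted in recompute(TABLE_11).items():
        tabulated = TABLE_11[name][2]
        print(f"{name}: model {predicted:.2f} kcal/mol, table {tabulated:.1f} kcal/mol")
```

Under these assumptions the recomputed values agree with the tabulated ones to within rounding, which supports reading the EF loop insertions as unfolded-favoring at 3 M GuHCl while the CD loop insertions remain marginally stable.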






Constructs to obtain mutant protein

Site-directed mutagenesis was performed on the FNfn10 gene to obtain
expression vectors for mutant proteins that contained a GGMGG insertion in
either the CD loop or the EF loop. The insertion was encoded in
oligonucleotides (Operon Technologies Inc.), which were used to produce the
N-terminal part of the FNfn10 gene by standard PCR. The purified DNA fragment
was cut at an existing NdeI site and at either an EcoRI or a SalI restriction
site to process the gene for the CD or the EF loop insertion, respectively.
Following the digest, the fragments were ligated into a suitably cut parental
vector. All constructs were confirmed by gene sequencing. Unfortunately, one of
the CD loop insertion mutants contained a glycine to alanine mutation within the
inserted glycines. Since the purpose of the glycines was solely to provide a
flexible environment around the methionine residue for more effective cleavage,
no significant alteration was expected due to this mutation. Nevertheless, only
the N-terminal fragment of this protein was used in experiments. For the
reconstitution of this protein, the C-terminal fragment of CD92 was utilized, as
it was designed to be exactly the same as for the wild type CD loop insertion
(CD45), including the artificial GG sequence at the beginning instead of GA.
All the mutant proteins were expressed as soluble protein and subsequently
purified using metal affinity chromatography, as previously described for the
wild type protein (Koide et al. 1998).

Residues 1-42 of FN3 were also expressed as a fusion protein,
His6-ubiquitin-FN3(residue 1-42), by cloning the gene corresponding to the FN3
fragment in a vector for ubiquitin (Kohno, 1998). This fusion protein was
expressed and purified as described before (Koide et al. 1998) except that
protein purification was performed in 4 M urea.

Protein cleavage and fragment purification 

Protein was diluted into 0.1 M HCl at protein concentrations of approx.
2 mg/ml and degassed. Approximately 2-5 mg of cyanogen bromide (CNBr)
was dissolved, and the reaction container was sealed under argon to minimize
tryptophan oxidation and incubated for 2 h at room temperature. The
well-established reaction of CNBr cleaving the peptide bond following a
methionine is shown in Figure 29. The reaction mixture was then passed through
a single-use reverse phase cartridge (Waters) to remove any remaining CNBr, and
bound proteins were recovered by eluting with 0.1 M HCl containing 60%
acetonitrile (CH3CN) and kept on ice. Elution fractions that exhibited
significant UV absorption were combined and diluted to approx. 25% CH3CN. The
samples were immediately loaded onto a reverse phase column (Resource RPC,
Pharmacia Amersham) and fragments were separated by a CH3CN gradient from
20% to 45% (see Figure 30). Eluted fractions containing pure fragment were
immediately frozen to -80 °C and lyophilized to minimize acid degradation.

NMR spectroscopy

NMR experiments were performed at 30 °C on an INOVA 600 [...]
Table 11) was expressed and purified. A 1H,15N-HSQC spectrum of the
uncleaved protein was recorded in 20 mM sodium phosphate buffer at pH 6.0
containing 100 mM sodium chloride and 5 % (v/v) deuterium oxide at a sample
concentration of 0.95 mM, as described previously (Kay et al. 1992). The
labeled protein sample was then recovered and cleaved with an approximately
10x molar excess of cyanogen bromide, and the resulting fragments were purified.
Experiments on the fragments were also performed in 20 mM sodium phosphate
buffer at pH 6.0 containing 100 mM sodium chloride and 5 % (v/v) deuterium
oxide on samples at a concentration of 0.5 mM for the C-terminal fragment and
0.25 mM for the N-terminal fragment sample. 10 % glycerol was added to the
N-terminal fragment sample to prevent aggregation. The two samples for the
reconstituted complex were prepared by dissolving both C- and N-terminal [...]
in the sample was 15N-labeled while the complementary part was not. GuHCl
was then gradually diluted out by a series of dialyses. Additionally, the samples
were concentrated and buffer exchanged using a Centricon spin filter (Amicon
Inc.) with a molecular weight cutoff at 3 kDa. Sample concentration for the
complex was measured to be 0.2 mM of 15N-labeled fragment, respectively. For
[...]
NMR data were processed using the NMRPipe package (Delaglio et al. 1995),
and analyzed using the NMRView software (Johnson and Blevins 1994).

Protein dynamics were probed using a heteronuclear {1H}-15N steady state
Nuclear Overhauser Effect (NOE) experiment (Farrow et al. 1994). An NOE is
observed due to cross relaxation of two spins that are in close proximity
(Cavanagh et al. 1996). The NOE enhancement of a coupled proton-nitrogen
pair was measured as the ratio of the peak volume while NOE transfer was
allowed to the peak volume of a control experiment without the saturation of
1H resonances (Kay et al. 1989).
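
The NOE ratio described above is a per-peak division of two volumes. The following minimal sketch shows that arithmetic; the peak labels and volumes are hypothetical numbers chosen for illustration, and the 0.75 cut-off is the value quoted in the Results of this example for a rigid assembly, not a general rule.

```python
def noe_ratio(volume_saturated, volume_reference):
    """Steady-state heteronuclear NOE: peak volume with 1H saturation divided
    by the peak volume of the control experiment without saturation."""
    if volume_reference == 0:
        raise ValueError("reference peak volume must be non-zero")
    return volume_saturated / volume_reference


def classify_peaks(peaks, rigid_threshold=0.75):
    """Label each peak as 'rigid' or 'flexible' on the fast time scale.

    `peaks` maps a peak label to (volume_saturated, volume_reference).
    The threshold is the value used in the Results of this example.
    """
    result = {}
    for label, (v_sat, v_ref) in peaks.items():
        ratio = noe_ratio(v_sat, v_ref)
        state = "rigid" if ratio > rigid_threshold else "flexible"
        result[label] = (ratio, state)
    return result


if __name__ == "__main__":
    # HYPOTHETICAL peak volumes in arbitrary units, for illustration only.
    example = {"peak_1": (82.0, 100.0), "peak_2": (31.0, 100.0)}
    for label, (ratio, state) in classify_peaks(example).items():
        print(f"{label}: NOE = {ratio:.2f} ({state})")
```

Because the analysis is done per peak, it does not require a residue assignment, which matches the qualitative use of the experiment in this example.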


Monitoring reconstitution of the fragments by fluorescence spectroscopy 

To measure the affinity of the interaction, a technique is preferable that
requires lower sample concentration than NMR. Therefore, the reaction was
investigated using the inherent fluorescence of the tryptophan residue present in
the N-terminal domain.

Proteins were dissolved to a final concentration of 500 nM in 20 mM
sodium phosphate buffer at pH 6.0 containing 100 mM sodium chloride, 750
mM glycerol unless otherwise noted, and various urea concentrations between
1 M and 2.5 M. Urea concentration was determined using an Abbe refractometer
(Spectronic Instruments) as described (Pace and Scholtz 1997). To obtain data on
the reconstitution at conditions without the addition of urea and glycerol, a series
of C-terminal fragment-titration experiments with varying urea or glycerol
concentrations, respectively, was performed. The dissociation constants in the
absence of urea or glycerol were estimated by extrapolation based on
experimental data.



The reconstitution reaction follows the scheme

    N + C \rightleftharpoons NC_{complex}

where:

    K_D = \frac{[N]\,[C]}{[NC_{complex}]}    (1)

With:

    [N] + [NC_{complex}] = [N]_0    (2)
    [C] + [NC_{complex}] = [C]_0    (3)

the equation results in:

    [N] = \frac{([N]_0 - [C]_0 - K_D) + \sqrt{([N]_0 - [C]_0 - K_D)^2 + 4 K_D [N]_0}}{2}    (4)

where [N] is the concentration of free N-terminal fragment, [NC_{complex}] the
concentration of complex in solution, and [N]_0 the total concentration of
N-terminal fragment, which is [...]. [C] and [C]_0 are treated similarly. This
relationship allows the [...] of titration experiments where the fluorescence F
is fitted by:

    [...]    (5)

F_[...] and F_[...] represent two more fitting parameters given by the starting
and [...] use of four fitting parameters: [N-terminal], F_[...] and F_[...], and
the dissociation constant of the reconstitution. Even though approximate values
of [...] in order to obtain a best fit. The fitting resulted in parameters close
to the expected approximate values, and restricting parameters to the expected
values led only to minor changes in the observed dissociation constant.
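
The closed-form solution for the free N-terminal fragment follows directly from the mass balance and the definition of the dissociation constant, and can be evaluated as below. This is a sketch under stated assumptions: the fluorescence model is not the legible form of equation (5) from this document but an assumed linear interpolation between a starting signal (all N-terminal fragment free) and an end signal (all N-terminal fragment bound). All function and parameter names are hypothetical.

```python
import math


def free_n(n_total, c_total, k_d):
    """Free N-terminal fragment for N + C <-> NC.

    Positive root of the quadratic obtained from
    K_D = [N][C]/[NC], [N] + [NC] = [N]0 and [C] + [NC] = [C]0.
    All concentrations share one unit (for example nM).
    """
    b = n_total - c_total - k_d
    return 0.5 * (b + math.sqrt(b * b + 4.0 * k_d * n_total))


def complex_concentration(n_total, c_total, k_d):
    """Concentration of the NC complex from the mass balance of N."""
    return n_total - free_n(n_total, c_total, k_d)


def fluorescence(n_total, c_total, k_d, f_start, f_end):
    """ASSUMED signal model: linear in the bound fraction of the fluorophore.

    f_start is the signal with no complex, f_end the signal when all of the
    N-terminal fragment is in the complex.
    """
    bound_fraction = complex_concentration(n_total, c_total, k_d) / n_total
    return f_start + (f_end - f_start) * bound_fraction


if __name__ == "__main__":
    n0 = 500.0  # nM, the fragment concentration used in this example
    for k_d in (0.1, 10.0):
        curve = [fluorescence(n0, c0, k_d, 1.0, 0.2) for c0 in (0, 250, 500, 1000)]
        print(k_d, [round(value, 3) for value in curve])
```

Evaluating the curve for several candidate dissociation constants at a fixed fragment concentration is one way to judge how sensitive a titration is to the value being fitted.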

The linear dependency of the free energy of unfolding of many proteins
[...]
energy of the reconstitution was assumed to depend linearly on urea
concentration as well. As the dissociation constant depends exponentially on
the free energy of the reconstitution, the dependence of the K_D on the urea
concentration required a linear fit of the logarithm of the K_D (see Figure 39).
Glycerol concentration was kept constant at 750 mM in this set of experiments.
For the glycerol concentration dependence, no precedent has been established in
the literature. The dissociation constant was therefore assumed to depend
linearly on glycerol concentration (see Figure 40). A logarithmic fit similar to
the urea concentration dependence did not represent the data well. However, the
dependence of the dissociation constant on glycerol was not very strong, and
thus similar values were obtained even if other dependencies were assumed.

Circular Dichroism 

The far UV-circular dichroism spectrum of the C-terminal fragment was
recorded at concentrations of 5 µM to 100 µM in 20 mM sodium phosphate
buffer (pH 6.0) containing 100 mM sodium chloride. A sample at 1.5 µM
fragment concentration was investigated to test for the presence of secondary
structure in buffer conditions of the fluorescence experiments (sodium phosphate
buffer with 1 M urea and 750 mM glycerol). The N-terminal fragment was
measured at a fragment concentration of 5 µM in 20 mM sodium phosphate
buffer (pH 6.0) containing 100 mM sodium chloride and 750 mM glycerol.
Circular dichroism measurements were performed using a Model 202
spectrometer equipped with a Peltier temperature controller (Aviv Instruments)
using a 1 cm pathlength.

The temperature dependence of the secondary structure in the C-terminal
fragment was investigated as well. The maximum of the inflection in the
spectrum at low temperature at 232 nm (see Figure 41) was recorded as the
sample temperature was raised at a rate of approximately 1 °C per minute. The
thermal denaturation data were fitted with the standard two-state model (Pace
and Scholtz 1997):

    \Delta G(T) = \Delta H_m (1 - T/T_m) - \Delta C_p [(T_m - T) + T \ln(T/T_m)]

where ΔG(T) is the Gibbs free energy of unfolding at temperature T, ΔH_m is the
enthalpy change upon unfolding at the midpoint of the transition, T_m, and ΔC_p
is the heat capacity change upon unfolding. ΔC_p was approximated to [...]
mol-1 K-1, based on Myers et al. (Myers et al. 1995), and kept constant for all
measurements.
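
The two-state model can be turned into a melting curve by converting the free energy into a folded fraction. The sketch below does this for the Gibbs-Helmholtz form with a constant heat capacity change. The parameter values in the example are hypothetical placeholders (the heat capacity value used in this document is not legible), and the function names are made up.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g_unfolding(temp_k, tm_k, dh_m, dcp):
    """dG(T) = dH_m (1 - T/T_m) - dC_p [(T_m - T) + T ln(T/T_m)].

    temp_k and tm_k in kelvin, dh_m in kcal/mol, dcp in kcal/mol/K.
    """
    return dh_m * (1.0 - temp_k / tm_k) - dcp * (
        (tm_k - temp_k) + temp_k * math.log(temp_k / tm_k)
    )


def fraction_folded(temp_k, tm_k, dh_m, dcp):
    """Folded fraction of a two-state system at temperature temp_k."""
    dg = delta_g_unfolding(temp_k, tm_k, dh_m, dcp)
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))
    return 1.0 / (1.0 + k_unfold)


if __name__ == "__main__":
    # HYPOTHETICAL parameters for illustration: T_m = 310.15 K (37 °C),
    # dH_m = 40 kcal/mol, dC_p = 0.5 kcal/mol/K.
    for celsius in (10, 25, 37, 50, 65):
        t = celsius + 273.15
        print(celsius, round(fraction_folded(t, 310.15, 40.0, 0.5), 3))
```

By construction the free energy is zero and the folded fraction is one half at the midpoint temperature, which is a quick sanity check on any fitted parameter set.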

Results

DgaaantMnceteffllil f pnrlrh- ^ e ate - 

The stability of the mutant proteins was investigated before cleavage
using guanidine hydrochloride induced unfolding and refolding reactions
monitored by tryptophan fluorescence. The fluorescence was quenched in the
folded protein and allowed a convenient way to measure the unfolding.
The unfolding curves of all four proteins are shown in Figures 26A and 26B.
The standard two-state transition was assumed in the analysis, and the resulting
parameters of the fitting are given in Table 11. The difference in stability
between CD loop and EF loop elongation mutants was observed for these mutants
featuring a five-residue insertion. The effect of the altered surface charge
D7KE9Q at the N-terminal end was difficult to judge from this measurement,
since the unfolding reaction in GuHCl was shown least sensitive to the mutation,
with only a marginal increase in stability (see Figure 26B). For the unfolding of
the proteins with a cleavage site, no stabilizing effect of the D7KE9Q mutations
within error was seen. The additional methionine had no unexpected effect in
addition to what was seen in the four glycine insertion.

[...] of N- and C-terminal fragments

The peptide bond following a methionine residue was cleaved under
acidic conditions using cyanogen bromide. Initial tests exhibited successful,
albeit incomplete, cleavage of all the proteins at mildly acidic conditions. The
preparation was further optimized by using more acidic reaction and
purification conditions and by strictly limiting the cleavage time to minimize
deamidation under the acidic conditions. A typical reverse phase chromatogram
for the CD loop cleavage is shown in Figure 28. The identity of the N- and
C-terminal fragments was confirmed by mass spectroscopic analysis. Although
there is no methionine in the wild type FNfn10 sequence, there is a secondary
cleavage site at the start of the protein, separating a multiple histidine
(HisTag) leader sequence from the N-terminal fragment. The N-terminal fragment
that has the HisTag still attached was shown to run as the contaminant peak
number 2. Contaminant peak number 3 was shown to be of slightly smaller mass
than the N-terminal fragment. Its volume increased with prolonged exposure to an
acidic environment regardless of CNBr presence, and it likely resulted from a
deamidation event. Contaminant peak number 1 appeared to include uncleaved
FNfn10 as well as both fragments. Its volume was sensitive to the exact loading
conditions, where a moderate amount of acetonitrile present in the loading buffer
decreased the peak volume. Most likely, reconstitution of the fragments was
taking place even at acidic conditions, and complexed fragments resulted in peak
number 1.

Initial tests revealed different fragment characteristics between cut sites

Trials to observe reconstitution using fluorescence were obstructed by the
presence of nonspecific adhesion; however, qualitative data could be obtained.
The Trp fluorescence of FNfn10 was highly quenched in the folded state. If the
reconstituted complex formed the same three-dimensional structure, quenching
of the fluorescence signal was expected upon a reconstitution reaction. Both
N-terminal fragments produced by a cleavage in the CD loop, with or without the
surface charge mutations, revealed such quenching upon the addition of the
C-terminal fragment, indicating that reconstitution had occurred. The fragments
produced by a cleavage in the EF loop, however, showed no indication of
reconstitution. The wild-type N-terminal fragment from CD45 (see Table 11)
exhibited a poorer solubility compared to the D7KE9Q counterpart, CD92. The
mutant CD-loop fragments were chosen for a detailed study of the reconstitution
reaction.

Size exclusion chromatography confirmed reconstitution of CD cut fragments

Size exclusion chromatography was performed on the fragments of CD92
using a Sephadex75 gel filtration column. When a mixture of N- and C-terminal
fragment at 5 µM concentration was loaded onto the gel filtration column (see
[...]), [...] peak at the retention time of uncut [...]

[...]

[...] fragment. The exact conformation of a protein [...] of a nuclear spin.
[...] spectroscopy allows the [...] shift of nearby atoms, giving rise to a
distinct [...] specific for a conformation of a protein [...]

[...] protein was purified and NMR analysis was performed. The 1H-15N HSQC
spectrum of the uncut protein exhibited a peak [...] for most amides (see
Figure 30). As expected, there were additional peaks that [...] residues, and
there were changes in the peak position for residues in the immediate vicinity
of the surface charge mutations D7KE9Q. Nevertheless, the changes in chemical
shift were limited to structurally adjacent residues, e.g., D23, which were
likely to be affected. Thus, the β-sandwich fold of FNfn10 was maintained in
the protein featuring both the D7KE9Q and the cleavage site insertion mutation.

The spectrum of each fragment by itself showed limited peak dispersion
indicative of an unstructured peptide (see Figure 31). For the C-terminal
fragment, additional peaks appeared that stem from a reversible transition to an
oligomeric state. When reducing the temperature to 5 °C, the population of this
alternative conformation was reduced, resulting in a spectrum indicative of an
unfolded peptide.

The N-terminal fragment in isolation was not soluble at high protein
concentrations and required the presence of 10% glycerol to record a spectrum
(see Figure 31). Though peak broadening indicates formation of larger
aggregates, the spectrum did not exhibit a significant spread of chemical shifts,
suggesting that the aggregate conformers were unstructured.

Once N- and C-terminal fragments were combined, the tendency of the
N-terminal fragment to aggregate decreased significantly, allowing higher
concentrations of fragment. This indicated that a more folded complex was
formed. An HSQC spectrum of a complex formed of labeled N-terminal
fragment and unlabeled C-terminal fragment exhibited a drastic change to a
well-dispersed distribution of peaks (see Figure 35A, B).

Similarly, addition of unlabeled N-terminal fragment to a labeled C-terminal
sample revealed a conformational change to a well-dispersed spread of chemical
shifts. The overlap of these two spectra, equivalent to the spectrum of a fully
labeled complex, was virtually identical to the previously recorded spectrum of
the uncut protein (see Figure 36A, B). Therefore, the reconstitution of the two
fragments resulted in the formation of a complex that had the same fold as the
original protein.


The next question investigated was whether the formation of a complex
with a fold similar to that of the original protein would result in a more dynamic
assembly. The association could lead to a looser assembly, where regions
could exhibit motion on much [...]

Steady-state {1H}-15N NOE measurements yield information on fast,
picosecond to nanosecond time scale dynamics of a molecule. For a qualitative
judgement of the overall changes in dynamics, a full assignment of the
resonances is not necessary, as it is not important to identify particular residues
at this point. Thus, the NOE experiment was analyzed for each peak found in
the investigated spectra (see Figures 35 and 36). A tentative assignment based on
the similarity of the complex spectrum to the known assignment of wild type
FNfn10 was shown as well.

For both the N-terminal and the C-terminal fragment in the complex, the
15N-NOE signal was predominantly above 0.75, indicating a rigid assembly
comparable to the uncut protein in this motion regime. In contrast, the isolated
C-terminal fragment showed significantly lower values for the majority of
resonances, characteristic of a flexible peptide in a random coil conformation.
The lack of fast dynamic motion further showed that the reconstitution resulted
in a fragment complex with characteristics very similar to those of the
uncleaved protein.

Determination of [...]

Glycerol and urea limit non-specific binding

Initial tests had already confirmed that the reconstituted complex
exhibited similar quenching of the signal as the uncut protein, which was to be
expected after NMR experiments had confirmed that the same three-dimensional
structure was formed. When the N-terminal fragment containing the fluorophore
was studied in isolation, the observed fluorescence appeared inconsistent.
Further investigation revealed that the fluorescence of the N-terminal fragment
decreased over time when the sample was kept in the quartz cuvette for the
measurement (see Figure 37). This was due to adherence of the fragment to the
cuvette walls. The effect was also observed on plastic surfaces of storage tubes.
It was most prevalent at the low concentration used in the fluorescence
experiments. The adherence was found to occur on a slow time scale, not coming
to an equilibrium within minutes, and was also found to be reversible on a slow
time scale. The exponential decay of the fluorescence signal interfered with the
detection of reconstitution, as quenching due to reconstitution and signal loss
due to adherence were indistinguishable. A series of different sample buffer
conditions was tested. It was found that the addition of glycerol and denaturing
co-solutes such as urea or guanidine hydrochloride decreased the magnitude of
adherence. At concentrations of 750 mM glycerol and 1 M urea or higher, the
fluorescence signal was found to be nearly constant over time.

Unusually high affinity of the reconstitution 

A dissociation constant can be determined from a titration experiment
where the concentration of the fluorophore is held constant and its corresponding
binding partner is added. At a glycerol concentration of 750 mM, 1 M urea and
an N-terminal fragment concentration of approximately 500 nM, the titration
with the C-terminal fragment was fitted with a K_D of approximately 10 nM (see
Figure 38). However, a K_D of 0.1 nM resulted in a nearly identical fit,
indicating that the dissociation constant lies outside the accurately assessable
range. When measuring the dissociation constant, the concentration of the
fluorophore has to be lower than or near the dissociation constant. To obtain a
more accurate value, a series of titration experiments was performed at higher
urea concentrations, followed by extrapolation to compensate for the addition
of urea. The resulting dissociation constants as a function of the urea
concentration are shown in Figure 39. The line in Figure 39 indicates the
detection limit of this method, which was set at 10 nM, 50x lower than the
concentration of fluorophore. Accurate determination of the dissociation
constant was no longer possible at or below this limit. As indicated by the
linear fit, an extrapolation to the absence of urea was made with reasonable
accuracy. The dissociation constant in the absence of urea, but in the presence
of 750 mM glycerol, was estimated to be 1.5 nM.

Indication of an oligomeric structure in the C-terminal fragment

In addition to the observation of secondary signals at higher temperature
in the NMR experiments, the circular dichroism spectrum of the C-terminal
fragment showed evidence of secondary structure. The spectrum showed an
inflection in the far UV regime at 230-235 nm (see Figure 41), which has been
associated with β-hairpin structures.

As shown in Figure 42, β-turn structures exhibited a cooperative
temperature dependence, with a midpoint of the melting curve of around 37 °C
for a protein concentration of 50 µM. The phenomenon was concentration
dependent, which was a clear indication that it involved an oligomerization
process, not an intramolecular folding reaction. Both the midpoint of the melting
curve and the cooperativity of the reaction changed when the concentration was
varied. At a concentration of 1.5 µM C-terminal fragment in the presence of 1 M
urea and 750 mM glycerol, the temperature increase resulted in a dependency that
could only be fitted if the same baseline slope seen at higher concentration was
assumed (see Figure 42). The higher baseline matched that of the higher
concentration measurements, while the lower one was outside the temperature
range, given the low cooperativity of the transition. Nevertheless, it was possible
to distinguish a transition even at this low concentration.


Discussion 

The reconstitution of the FNfn10 fragments generated from a cleavage in
the CD loop could be observed, and the formation of the original structure was
demonstrated utilizing fluorescence, gel filtration and NMR experiments. The
NMR spectra indicated that the structure of the FNfn10 domain was
reestablished. The NMR data also demonstrate that the whole complex was as
rigid as the uncut protein.

The dissociation constant of the reconstitution was determined to be 3.6
nM using fluorescence spectroscopy. Fragments of a number of proteins have
been reported to reconstitute structure and function. However, only a few reports
give the dissociation constants of the reconstitution reaction. The K_D of the
reconstitution of the CD92 protein is one of the lowest reported values to date
(Table 12).

Protein                      K_D (nM)             Number of   Comment                   Reference
                                                  residues
FNfn10                       3.6                  42 + 47                               (our data)
Chymotrypsin Inhibitor-2     40                   40 + 24                               (Ladurner et al. 1997)
  (wild type)
Ubiquitin                    38 000               35 + 40                               (Jourdan and Searle 2000)
Protein G B1 domain          10 000               40 + 15                               (Honda et al. 1999)
Barnase                      600                  36 + 73                               (Sancho and Fersht 1992)
S-protein/S-peptide          599 (wild type)      20 + 104    Large unit folds          (Dwyer et al. 2001)
                                                              independent
S-protein/LB2 variant        5.4 (best selected)  20 + 104    Improvement by phage      (Dwyer et al. 2001)
  S-peptide                                                   display
Calbindin D9K EF-hands       0.003                43 + 31     Ca2+ dependent,           (Berggard et al. 2001)
                                                              fragments fold
                                                              independently

Table 12: Comparison of K_D values for reconstitution reactions reported in the
literature.
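
To compare the entries of Table 12 on an energy scale, a dissociation constant can be converted into a standard free energy of dissociation with ΔG° = -RT ln(K_D / 1 M). The sketch below applies this textbook relation to the tabulated values. The temperature of 298.15 K is an assumption made for illustration only, since the cited studies were not necessarily performed at the same temperature, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

# K_D values in nM as listed in Table 12.
TABLE_12_KD_NM = {
    "FNfn10": 3.6,
    "Chymotrypsin Inhibitor-2": 40.0,
    "Ubiquitin": 38000.0,
    "Protein G B1 domain": 10000.0,
    "Barnase": 600.0,
    "S-protein/S-peptide": 599.0,
    "S-protein/LB2 variant S-peptide": 5.4,
    "Calbindin D9K EF-hands": 0.003,
}


def dissociation_free_energy(kd_nm, temp_k=298.15):
    """Standard free energy of dissociation in kcal/mol (1 M standard state).

    temp_k defaults to 298.15 K, which is an ASSUMPTION for illustration.
    """
    kd_molar = kd_nm * 1e-9
    return -R_KCAL * temp_k * math.log(kd_molar)


if __name__ == "__main__":
    ranked = sorted(TABLE_12_KD_NM.items(), key=lambda item: item[1])
    for name, kd_nm in ranked:
        print(f"{name}: {dissociation_free_energy(kd_nm):.1f} kcal/mol")
```

Each tenfold decrease in K_D adds RT ln(10) to the free energy, roughly 1.4 kcal/mol at the assumed temperature, so the spread in Table 12 corresponds to several kcal/mol.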

Only one case (Berggard et al. 2001) reported a lower value, though in that
particular case, both fragments folded independently and the Ca2+ binding was
essential for the reconstitution. Metal binding has been shown to stabilize the
three-dimensional structure of proteins (Savchenko, 2002; Lee et al. 1989;
Pabo, 2001; Li, 2001), which likely applies to a reconstitution as well. This
might impede direct comparison of this reconstitution reaction with the others.
The high affinity of the FNfn10 fragments compared to other reconstitution
reactions is consistent with a correlation to a high stability of the parental
protein.

Fragment reconstitution has been reported for a number of proteins,
indicating that the phenomenon is not an extraordinary characteristic (de Prat
Gay and Fersht 1994; Kippen et al. 1994; Tasayco and Chao 1995; Ladurner et
al. 1997; Pelletier et al. 1998; Tasayco et al. 2000; Berggard et al. 2001). A
similar reaction could potentially be found for any protein because the driving
force to form the particular three-dimensional structure is generally independent
of the maintenance of a single peptide bond. Cyclic permutations of proteins
confirm that if two amino acids are in proximity to each other within a fold, the
addition of a bond is generally possible as well (Zhang, 1993; Hennecke, 1999),
as long as important folding elements stay intact. However, not every peptide
bond is expendable, and removing more than one at a time may not be possible.
Each peptide bond must carry some information on the three-dimensional
structure. Obviously, the importance of the information contained in any one
peptide bond varies within a protein, which gives rise to the differences seen
between two cleavage sites in the capability of fragments to reconstitute a
protein.

Folding of a protein, and therefore reconstitution of a protein from
fragments, is primarily driven by the burial of hydrophobic surface away from
water. If the total surface burial upon folding from disordered peptides were
responsible, two cleavage sites would not result in different affinities, as the
same protein is folded. If the interaction between fragments is needed to
maintain a complex, then the burial in the interface between the fragments is
more relevant. As an approximation, the amount of newly exposed surface in
the interface upon separation of the fragments was calculated using the Connolly
algorithm (Connolly, 1983) in the program GRASP (Nicholls et al. 1991).
1930 Å² were exposed upon cleavage in the CD loop, while 1181 Å² were
exposed upon cleavage in the EF loop, based on the crystal structure of FNfn10
(Dickinson et al. 1994). The surface areas found for both cleavage sites were
comparable to binding interfaces with reasonable affinity. If the buried interface
were the only determining factor, both cut sites would produce fragments that
reconstitute readily, and less of a difference in affinity would be expected
between the differently cut fragments.

Most cleavage sites reported were in a flexible region of the reconstituted
protein. A peptide bond necessitates the proximity of two residues, which
[...] a complex of fragments missing this peptide bond. The region surrounding
the cleavage site of a reconstituted complex therefore exhibits an increased
flexibility compared to the uncut protein. An entropic penalty applies to a
protein if a region is mutated to be more flexible. A region that is already
flexible suffers a smaller penalty upon cleavage. The highest possible stability
for a complex is achieved if a protein is cleaved in a flexible region.

The significant decrease in stability in the EF loop elongation mutant
[...] inability of the EF loop fragments to reconstitute compared to the CD loop
fragments was correlated with the significantly lower stability of the EF loop
elongation mutant. This indicated that for a moderate to high affinity, the
determining factor was the stability of the parental protein. As the cleaved
proteins had a significant elongation inserted at the cleavage site, the stability of
these proteins had already suffered an entropic penalty. The EF loop had
suffered an aggravated penalty due to the [...] the FNfn10 fold. The data suggest
that the stability of an insertion mutant can be utilized to predict if
reconstitution is possible.

The presence of distinct resonances in the HSQC, whose chemical shifts
were far from random coil values, was evidence for an oligomerization of the
C-terminal fragment. Additionally, the oligomers cause a concentration dependent
inflection in the CD spectra indicative of secondary structure, which had not
completely vanished at 1.5 µM C-terminal fragment concentration. The
oligomeric structure monitored by CD exhibits a clear cooperative temperature
[...] experiments. However, the detected presence of distinct peaks in the HSQC
and the solubility to more than 1 mM concentration suggested the absence of large
insoluble aggregates. More likely, much smaller oligomers containing a limited
number of molecules existed under these conditions. Formation of larger
oligomers resulted in increased linewidth in the NMR, similar to what was seen
for the N-terminal fragment. Oligomerization of the N-terminal domain caused
significant line broadening compared to the similarly sized C-terminal domain,
indicating much larger oligomers. However, the N-terminal fragment did not
exhibit a more dispersed spectrum at the same time. Thus, the comparably large
oligomers were not in a distinct structure. The C-terminal fragment exhibited
characteristics in the CD and the NMR data indicating formation of small
oligomers with a distinct structure, most likely a β-sheet conformation, that were
present even at low micromolar concentrations.

A consequence of oligomer formation was its competition with the
reconstitution reaction. One possibility to influence the reconstitution arises if
the dissociation of an oligomer is rate-limiting for the formation of the
reconstituted complex of N- and C-terminal fragments. The reconstitution
complex forms rapidly, judging from the exceptionally high affinity measured.
The reaction is slowed by a possible dissociation of oligomer that has to occur
prior to reconstitution. Consistent with such a competing oligomerization
reaction are observations made in the fluorescence experiments, that
equilibration of the complex formation is slower than expected for the measured
high affinity. Additional indication was obtained while mixing highly
concentrated samples for the NMR experiments. Even longer equilibration times
and consequently careful sample preparation were necessary to attain
reconstituted complex at high fragment concentration, possibly reflecting an
increased population of C-terminal fragment in the oligomeric state unavailable
for the reconstitution reaction. The dissociation of the oligomeric structure of
the C-terminal domain is therefore likely to be rate limiting for the formation of
the reconstituted
complex.

The reconstitution reaction occurs in vivo

Evidence on the possibilities to apply the observed reconstitution of
FNfn10 to a yeast two-hybrid selection suggested that the reaction occurs in
vivo. A yeast two-hybrid selection was previously applied to isolate FNfn10
[...] [Koide 2002 #1160]. Selected proteins, termed 'monobodies', were isolated in
a ligand specific manner. Utilizing binding proteins from this selection, yeast
two-hybrid assays were conducted to test if monobody fragments reconstitute
into a FNfn10 fold in vivo. Fragments that featured either a wild type or a
[...] proteins (see Figure 44). FNfn10 reconstituted specifically in vivo. The
results confirmed that all fragments with a mutant FG loop reconstituted nearly
as well as the wild type fragments, indicating that the FG loop does not
contribute significantly to FNfn10 stability.

EXAMPLE XXII
Reconstitution of monobodies in yeast cells
This example demonstrates that the fragment reconstitution reaction of
FN3 has sufficiently high affinity and specificity as a means to heterodimerize
proteins of interest. The results in this example also strongly suggest that yeast
two-hybrid libraries based on FN3 fragment reconstitution can be constructed for
large scale screening.
Strains and media

Yeast strains EGY48, MATα his3 trp1 ura3 leu2::LexAop-LEU2, and
RFY206, MATa his3Δ200 leu2-3 lys2Δ201 trp1Δ::hisG ura3-52, have been
[...] Origene. Yeast was grown in YPD media or YC dropout media following
instructions from Origene and Invitrogen. [...]
according to Sambrook et al. (Sambrook et al. 1989).



Constructions of plasmids for the yeast two-hybrid screening and
monobody libraries

The plasmids for monobody-reconstitution were constructed as follows.
The plasmid pFNB42, which encodes a FLAG tag-FNfn10-NLS (nuclear localization
signal)-B42 fusion protein, was constructed by PCR. The oligonucleotides used
for the construction of the plasmids for monobody-reconstitution are found in
Table 13.



Table 13: The oligonucleotides used for the construction of the plasmids for
monobody reconstitution

Name            DNA sequence
FNABCGGKpnR     ACCACCGGTACCACCACCGTTACCACCGGTTTCACC (SEQ ID NO: 125)
FNDGGBamF       CGGGGATCCAAGGTGGTGGCTCCCCGTTCAGGAATTC (SEQ ID NO: 126)
NcoFLAGFNF      CATGCCATGGACTACAAGGACGACGATGACAAGGGTATGCAGGTTTCTGATGTTC
                (SEQ ID NO: 127)
KpnGGTGGSNLSF   GTGGTACCGGTGGTTCCCCTCCAAAAAAGAAGAGAAG (SEQ ID NO: 128)
FNKpnGGTGGSR    GGAACCACCGGTACCACCGGTACGGTAGTTAATCGAG (SEQ ID NO: 129)
B42TAAXhoR      CCGACTCGAGTTAATCTCCACTCAGCAAGAG (SEQ ID NO: 131)
T7F             TAATACGACTCACTATAGGG (SEQ ID NO: 130)
FN5R            CGGGATCCTCGAGTTACTAGGTACGGTAGTTAATCGA (SEQ ID NO: 132)
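
The primer names in Table 13 hint at the restriction sites they carry (Nco, Kpn, Bam, Xho). The following Python sketch is one way to check this; it is not part of the described procedure. It assumes the standard recognition sequences for NcoI, KpnI, BamHI and XhoI, which the text does not state, and the function name `sites_in` is a hypothetical helper introduced only for illustration. Because all four sites are palindromic, forward and reverse primers can be scanned as written.

```python
# Standard recognition sequences (an assumption; not given in the text).
SITES = {
    "NcoI": "CCATGG",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
}

# Primer sequences as listed in Table 13.
PRIMERS = {
    "FNABCGGKpnR": "ACCACCGGTACCACCACCGTTACCACCGGTTTCACC",
    "FNDGGBamF": "CGGGGATCCAAGGTGGTGGCTCCCCGTTCAGGAATTC",
    "NcoFLAGFNF": "CATGCCATGGACTACAAGGACGACGATGACAAGGGTATGCAGGTTTCTGATGTTC",
    "KpnGGTGGSNLSF": "GTGGTACCGGTGGTTCCCCTCCAAAAAAGAAGAGAAG",
    "FNKpnGGTGGSR": "GGAACCACCGGTACCACCGGTACGGTAGTTAATCGAG",
    "B42TAAXhoR": "CCGACTCGAGTTAATCTCCACTCAGCAAGAG",
    "T7F": "TAATACGACTCACTATAGGG",
    "FN5R": "CGGGATCCTCGAGTTACTAGGTACGGTAGTTAATCGA",
}


def sites_in(seq):
    """Return {enzyme: [0-based start positions]} for each site found in seq."""
    seq = seq.upper()
    found = {}
    for enzyme, site in SITES.items():
        positions = [
            i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)
        ]
        if positions:
            found[enzyme] = positions
    return found


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        # Print only the enzyme names to keep the report compact.
        print(f"{name}: {sorted(sites_in(seq)) or 'no listed site'}")
```

Under these assumptions, each primer carries the site its name suggests, and FN5R carries both a BamHI and an XhoI site, consistent with its use for cloning into pEG202 with BamHI and XhoI.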



The oligonucleotides NcoFLAGFNF and FNKpnGGTGGSR were used to amplify the
FNfn10 gene from pAS45 (Koide et al. 1998), and the oligonucleotides
KpnGGTGGSNLSF and B42TAAXhoR were used to amplify the NLS-B42 gene
from pYesTrp2 (Invitrogen). The two PCR fragments were annealed and
extended using PCR, then digested with NcoI and XhoI. The fragment was
ligated in pYesTrp2 that was digested with the same restriction enzymes. The
FNfn10 gene of pFNB42 was replaced with the gene of the N-terminal fragment
of FNfn10 (the ABC-strands of FNfn10) to construct the plasmid pFNABCB42,
which encodes a FLAG tag-N terminal fragment of FNfn10 (ABC strands)-NLS
(nuclear localization signal)-B42 fusion protein, using the restriction enzymes
NcoI and KpnI. The gene of FNfn10 ABC strands was amplified from pFNB42
or vectors of ERαEF-binding monobodies whose AB-loop had been mutated,
using oligonucleotides T7F and FNABCGGKpnR. The plasmid
pEGFNDEFG, which encodes a LexA-C terminal fragment of FNfn10 (DEFG
strands) fusion protein, was constructed by cloning a gene of FNfn10 DEFG
strands in pEG202 (Origene) using restriction enzymes BamHI and XhoI. The
gene of FNfn10 DEFG strands was amplified from pAS45 or the vector that
encodes a yeast ORF-binding monobody whose FG-loop has been mutated, using
oligonucleotides FNDGGBamF and FN5R.



β-Galactosidase assay for monobody-reconstitution

The yeast strain EGY48 was transformed with a derivative of the
pFNABCB42 plasmid encoding a fusion of the N terminal fragment of a particular
monobody-NLS-B42. The yeast strain RFY206 that has the plasmid pSH18-34
was transformed with a derivative of the pEGFNDEFG plasmid encoding a
fusion of LexA-C terminal fragment of a particular monobody. The EGY48
strains and the RFY206 strains were mated and replicated onto a YC Gal Raf -his
-ura -trp plate, then the β-galactosidase activity of the mated strains was
measured by the agarose overlay method (Duttweiler 1996).



Results 

EGY48 strains harboring a derivative of pFNABCB42, which encodes the
N terminal fragment of FNfn10-NLS-B42 fusion protein, were mated with
RFY206 strains harboring a β-galactosidase reporter plasmid and a derivative of
pEGFNDEFG, which encodes the LexA-C terminal fragment of monobody fusion
protein. The mated strains were tested for β-galactosidase activity, and the
results are shown in Figure 27. The amino acid sequences of the FG loop region
of the C terminal half of monobodies are listed in Table 14. The results show
that not only FNfn10, but also monobodies can be reconstituted in vivo.

Table 14: The sequence of the FG-loop regions of the C-terminal half of
monobodies

Clone name        Amino acid sequence of the FG loop
pEGFNDEFG         VTGRGDSPASSKP (SEQ ID NO: 133)
pEGFNDEFG0319     VTGQWALYLSSKP (SEQ ID NO: 134)
pEGFNDEFG4699     VTGGEVRCVRDAASWSSWLKP (SEQ ID NO: 135)


EXAMPLE XXIII 

Examples of mutations to be introduced to alter the association specificity of
N-FN3 and C-FN3

The inventor previously demonstrated that charged residues on the
surface of FN3 have large effects on the stability of FN3 (Koide et al., 2001).
Mutations of residues on the protein surface cause small perturbations on the
overall structure of a protein. Also, interactions between residues at "cross
strand" positions (i.e., residues on neighboring beta-strands that are directly
adjacent to each other) are known to influence the beta-sheet stability (Smith &
Regan, 1995). Control of peptide association using charged surface residues has
been well documented, particularly for coiled coil peptides (see Oakley and Kim,
and references therein). Therefore, such mutations are used to modulate the
affinity between N-FN3 and C-FN3. Below is a general strategy for using
N-FN3 and C-FN3 that are separated in the CD loop. Note, however, that this
strategy is applicable to FN3 fragments that are separated at points other than
the CD loop.

Strands B and E are aligned in the anti-parallel manner in one sheet of
FN3 (see Figure 45). These two strands belong to different fragments.
Mutations such as D21 (or E21) and D56 (or E56) cause electrostatic repulsion
between negative charges on strand B and strand E, thus destabilizing the
complex of N-FN3 and C-FN3. Similarly, R21 (or K21) and R56 (or K56) cause
repulsion between positive charges on strands B and E, thus destabilizing the
complex. In contrast, when N-FN3 with D21 (or E21) and C-FN3 with R56 (or
K56) are combined, the electrostatic repulsion is eliminated, and the two
fragments form a stable complex. Likewise, a combination of R21 (or K21) and
D56 (or E56) also facilitates the association. The "cross strand" positions at
which such mutations can be introduced include residues 19 and 58, 17 and 60,
and 23 and 54 on strands B and E; residues 37 and 45, 35 and 47, 33 and 49, and
31 and 51 on strands C and D; and residues 37 and 69, 35 and 71, 33 and 73, and
31 and 75 on strands C and F. Mutations at these positions are combined to
adjust the affinity and specificity of association.
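
The charge-pairing logic described above can be sketched in a few lines of Python. This is only an illustration under a deliberately crude assumption, namely that Asp/Glu carry a formal charge of -1, Arg/Lys carry +1, and every other residue is neutral; it says nothing about the magnitude of the effect. The names `CHARGE`, `BE_PAIRS`, `pair_effect` and `describe` are hypothetical and introduced here for illustration.

```python
# Formal side-chain charges at neutral pH (simplifying assumption).
CHARGE = {"D": -1, "E": -1, "R": +1, "K": +1}

# Cross-strand position pairs on strands B and E named in the text,
# written as (position in N-FN3, position in C-FN3).
BE_PAIRS = [(21, 56), (19, 58), (17, 60), (23, 54)]


def pair_effect(n_residue, c_residue):
    """Classify a cross-strand pair by the sign of the charge product."""
    product = CHARGE.get(n_residue, 0) * CHARGE.get(c_residue, 0)
    if product > 0:
        return "repulsive"   # like charges, expected to destabilize the complex
    if product < 0:
        return "attractive"  # opposite charges, expected to favor association
    return "neutral"         # at least one uncharged residue


def describe(pair, n_residue, c_residue):
    """Format one residue combination at a given position pair."""
    n_pos, c_pos = pair
    effect = pair_effect(n_residue, c_residue)
    return f"{n_residue}{n_pos} / {c_residue}{c_pos}: {effect}"


if __name__ == "__main__":
    for n_res, c_res in [("D", "D"), ("R", "R"), ("D", "R"), ("K", "E")]:
        print(describe(BE_PAIRS[0], n_res, c_res))
```

Applying the same classification to several position pairs at once gives a simple way to enumerate which N-FN3/C-FN3 variant combinations are expected to pair and which are expected to repel.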

For a combination of N-FN3 and C-FN3 with a different separation point,
cross strand pairs are identified using the same principle.

The second class of mutations that can be used to alter the specificity of
FN3 fragment reconstitution is those in the core of FN3. The core of a protein is
generally tightly packed [...]
(Matthews 1993). Thus, multiple mutations can be introduced in the core (for
example, positions 10, 20, 36, 70 and 90 can be simultaneously mutated).
[...] need to introduce multiple mutations to achieve tight interaction of fragments.
Core mutations and surface mutations can be used in combination, which should
provide a high degree of interaction specificity.

EXAMPLE XXIV
Procedures for library construction and screening using "split FN3"

Nomenclature

N-FN3 and C-FN3 denote N-terminal and C-terminal fragments of FN3
that are produced by separating FN3 at a position within a loop. For example, if
the separation is within the CD loop, N-FN3 contains the A, B and C strands, the
AB and BC loops and a section of the CD loop. C-FN3 then contains the
remaining section of the CD loop, the DE, EF, and FG loops and the D, E, F, and
G strands.

A binding pair denotes a pair of molecules that associate with each other,
having a dissociation constant of less than 10⁻⁵ M. Binding pairs can be used to
augment the association (reconstitution) of N-FN3 and C-FN3. One example of
a binding pair is coiled coils.
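
To make the dissociation constant threshold concrete, the following Python sketch computes how much complex forms at equilibrium for a given dissociation constant. It assumes a simple 1:1 equilibrium A + B <-> AB with no competing oligomerization, which is an idealization; the function names are hypothetical. The complex concentration is the smaller root of the quadratic obtained from Kd = [A][B]/[AB] together with mass conservation.

```python
import math


def complex_concentration(a_total, b_total, kd):
    """Equilibrium concentration of AB for A + B <-> AB.

    All arguments are molar concentrations; kd is the dissociation constant.
    Derived from kd = [A][B]/[AB] with [A] = a_total - [AB] and
    [B] = b_total - [AB]; the physically meaningful root is the smaller one.
    """
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


def fraction_bound(a_total, b_total, kd):
    """Fraction of the limiting component that is found in the complex."""
    limiting = min(a_total, b_total)
    return complex_concentration(a_total, b_total, kd) / limiting


if __name__ == "__main__":
    # Both components at 10 micromolar, with progressively tighter binding.
    for kd in (1e-5, 1e-6, 1e-9):
        print(f"Kd = {kd:.0e} M: fraction bound = {fraction_bound(1e-5, 1e-5, kd):.3f}")
```

With both components at 10 µM and a dissociation constant equal to the 10⁻⁵ M threshold, this model gives a bound fraction of (3 − √5)/2, roughly 0.38; tighter binding pairs push the fraction toward 1.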

1. Phage display
a. Two vector system

N-FN3 is fused to a phage coat protein in such a way that it is displayed
on the surface of bacteriophage (Kay et al., 1996; Koide et al., 1998).
Alternatively, C-FN3 may be fused to a phage coat protein such as pIII and
pVIII. An N-terminal secretion sequence is added to the complementary
fragment (the fragment that is not fused to a phage coat protein) in such a way
that the fragment is secreted into the periplasmic space of Escherichia coli.
Genes of these fusion proteins are encoded on a phagemid vector, such as
pBlueScript (Stratagene), under the control of a regulatable promoter.
Alternatively, the phage genome can be used. The phagemid encoding N-FN3
contains a drug resistance marker (such as ampicillin resistance), and the
phagemid encoding C-FN3 contains a different marker (such as kanamycin
resistance), so that they are easily separated.

A binding pair can be added to N-FN3 and C-FN3 in such a way that the
binding pair enhances the association of N-FN3 and C-FN3. This is done by
fusing the gene for one component of the binding pair to the N-FN3 gene and
the other to the C-FN3 gene using a flexible linker sequence (e.g., poly-Gly)
between fused peptides.

Combinatorial libraries of N-FN3 and C-FN3 in appropriate phagemids
as described above, in which residues in a loop region are diversified (including
insertions and deletions), are made using standard methods (Koide et al., 1998).
Phagemid particles for N-FN3 and C-FN3 are separately generated using a helper
phage as described (Koide et al., 1998). Subsequently, E. coli cells (such as
XL1-blue, Stratagene) harboring an N-FN3 library are further infected with the
phagemids encoding a C-FN3 library so that in a single E. coli cell both one (or
more) clone of the N-FN3 library and one (or more) clone of the C-FN3 library
coexist. Phagemid particles are then produced from these cells under conditions
where N-FN3 and C-FN3 are expressed. N-FN3 and C-FN3 associate in the
periplasm of E. coli and thus phagemid particles display the reconstituted FN3
representing one clone from the N-FN3 library and one from the C-FN3 library.
The phagemid transduction process is very efficient so that one can construct a
large library.

Screening of displayed FN3 is performed using standard methods (Kay et
al., 1996; Koide et al., 1998). Note that a phagemid particle contains the gene
for either N-FN3 or C-FN3, and thus it is necessary to recover at least two
phagemid particles to identify the correct combination of N-FN3 and C-FN3
variants with desired binding function.

Recovered phages are amplified and again used to infect E. coli so that a
single E. coli cell harbors both an N-FN3 clone and a C-FN3 clone. Phagemid
particles are then produced as described above. After a few cycles of these
selection and amplification processes, genes encoding the contiguous, full-length
monobodies are constructed from the genes for N-FN3 and C-FN3 variants in the
selected pool using PCR techniques and cloned into a phagemid vector.
Standard phagemid selection experiments are then performed to identify
full-length monobodies with desired binding properties.

b. One vector system

A phagemid vector expressing N-FN3 fused to a phage coat protein and a
secretion signal and C-FN3 fused to a secretion signal (or vice versa) under a
single promoter is constructed. A recombinase recognition site such as
wild-type lox is introduced in the intergenic region between the N-FN3 and
C-FN3 genes. Another recombination site, which is orthogonal to the first one,
such as loxP511, is introduced after the two FN3 fragment genes. Examples of
such phagemid vectors have been described in the literature (Sblattero and
Bradbury 2000; Sblattero et al. 2001). Mutations are introduced in a loop region
within the N-FN3 gene using standard methods to generate a library of N-FN3.
To this ensemble of phagemid vectors encoding the library, further mutations in
a loop region of the C-FN3 gene are introduced. Phagemid particles are prepared
from this ensemble of vectors encoding both N-FN3 and C-FN3 libraries using
helper phages. E. coli cells that constitutively express an appropriate
recombinase, such as Cre recombinase, are infected with the phagemid particles
with a high multiplicity of infection so that a single cell is infected with multiple
phagemids. The recombinase in the E. coli cells recombines the phagemids at the
recombination sites, thus creating further diversity. Phagemid particles are then
produced from these cells and then used to infect, at a low multiplicity of
infection, another E. coli cell line that does not express the recombinase.
Phagemid particles are produced from these E. coli cells for library selection.
Library selection and amplification of selected phagemids are performed using
standard methods, except that the recombination step can be introduced to
further increase the library diversity.

2. Yeast two-hybrid

A binding target ("bait") is fused to a DNA binding domain, and N-FN3
(optionally with a component of a binding pair) is fused to an activation domain
using standard methods (Golemis & Serebriiskii, 1997; Koide et al., 2002).
C-FN3 (optionally with the other component of a binding pair) is expressed
without fusing it to an activation domain under a strong promoter such as Gal so
that it associates with the N-FN3-activation domain fusion. A library of N-FN3 is
constructed in yeast cells of one mating type (e.g., the strain EGY48) and a
library of C-FN3 is constructed in yeast cells of the other mating type (e.g., the
strain RFY206). The bait plasmid is introduced in one of the yeast cells before
constructing a library. The two yeast strains are mated and yeast two-hybrid
screening is performed using standard methods as described previously (Golemis
& Serebriiskii, 1997; Koide et al., 2002). Alternatively, C-FN3 can be fused to
an activation domain and N-FN3 can be expressed without fusing it to an
activation domain.

3. Yeast surface display

N-FN3 (optionally with a component of a binding pair) is fused to the
Aga2 protein in such a way that it allows the surface display of N-FN3 (Boder &
Wittrup, 1997; Boder & Wittrup, 2000). A vector such as pYD1 (Invitrogen) is
used for this purpose. C-FN3 (optionally with the other component of a binding
pair) is fused to an N-terminal secretion sequence and the gene coding for this
fusion protein is placed under an appropriate promoter such as GAL.
Alternatively, C-FN3 can be displayed on the yeast surface and N-FN3 can be
expressed without fusing it to Aga2. A library of N-FN3 is constructed using
standard methods (Koide et al., 2002) in the yeast strain EBY100 (Invitrogen),
and a library of C-FN3 is constructed in the yeast strain BJ5464 (ATCC). A
collection of EBY100 cells containing an N-FN3 library and a collection of
BJ5464 cells containing a C-FN3 library are mated to produce diploid cells each
[...] and C-FN3 variants are then displayed on the yeast surface, and selection of
clones is performed as described (Boder & Wittrup, 1997; Boder & Wittrup,
2000).

The following section describes a specific example of yeast surface
display. The plasmid pYDFN1 was constructed by inserting the FN3 gene into
the pYD1 vector (Invitrogen), and the gene is expressed under the Gal promoter.
The FN3 gene was prepared by PCR, and the termination codon of the FN3 gene
was removed in such a way that the FN3 gene and the V5 tag sequence are in
frame. N-FN3 (residues 1-42) was fused to the secretion signal of the Aga2
protein and the FLAG tag in such a way that it allows the [...] of N-FN3 followed
by [...] PCR and pYDFN1 was used as a template. The secretion
signal-N-FN3-FLAG [...] is named pGalsecFN(N)FLAG. A yeast surface display
vector for C-FN3 (residues 43-94) was constructed from the plasmid pYDFN1.
The DNA segment encoding the Express tag and residues 1-42 of FN3 was
deleted by PCR so that [...] pGalAgaFN(C)V5. Schemes of these vectors are
shown in Figure 46. In addition, a pGalAgaFN(C)V5 containing a FG-loop from
a monobody "STAV11" that binds to streptavidin was constructed
(pGalAgaFN(C)V5-STAV11). The STAV11 clone contains an FG loop sequence
of HPMNEKN in place of the wild-type sequence, RGDSPAS.

Yeast EBY100 was transformed with the plasmid pGalsecFN(N)FLAG,
and BJ5464 was transformed with the plasmid pGalAgaFN(C)V5 or
pGalAgaFN(C)V5-STAV11. These two strains were mated, and the diploid cells
were analyzed using a fluorescence activated cell sorter (FACS).

Mated cells were grown in YC Glc ura- trp- leu- media, followed by YC
Gal Raf ura- trp- leu- media in order to induce the expression of the fusion
proteins. These media are according to Boder and Wittrup (Boder and Wittrup
1997). Cells were spun down and washed with BSS (Tris-Cl pH 7.4, NaCl, 1 mg/ml
BSA). The cells were mixed with rabbit anti-FLAG antibody (Sigma) and
monoclonal anti-V5 antibody (Sigma) in BSS and incubated on ice for 40
minutes. The cells were spun down, washed with BSS, and mixed with
anti-rabbit antibody-PE (Sigma) and anti-mouse antibody-FITC (Sigma) in BSS
and incubated on ice for 40 minutes. The cells were spun down, washed with
BSS, and subjected to a cell sorter (FACScanll, Becton Dickinson). In this
staining scheme, FITC fluorescence intensity indicates the amount of C-FN3 on
the yeast surface and PE intensity indicates the amount of N-FN3 on the surface.

As shown in Figure 47, the FITC fluorescence monitoring the V5 epitope
tag attached to C-FN3 was correlated to the expression of C-FN3 and
C-FN3-STAV11. The PE fluorescence monitoring the FLAG tag attached to
N-FN3 was correlated to the expression of N-FN3 when C-FN3 was
co-expressed. The surface display of N-FN3 was dependent on C-FN3
expression, indicating that N-FN3 and C-FN3 reconstituted on the yeast surface.
These results show that combinatorial libraries can be constructed from fragment
libraries using yeast mating as described above.

Once specific pairs of N-FN3 and C-FN3 with desired binding properties
are identified, genes encoding contiguous, full-length monobodies containing the
identified loop sequences are constructed from the genes for the fragments. The
genes for such full-length monobodies are cloned into vectors for library
screening and/or into expression vectors, and these new vectors are used for
further library screening and protein production.

The complete disclosure of all patents, patent documents and
publications cited herein are incorporated by reference as if individually
incorporated. The foregoing detailed description and examples have been given
for clarity of understanding only. No unnecessary limitations are to be
understood therefrom. The invention is not limited to the exact details shown
and described, for variations obvious to one skilled in the art will be included
within the invention defined by the claims.

WHAT IS CLAIMED IS:

1. A fibronectin type III (Fn3) monobody binding pair comprising:

(a) a first fibronectin type III (Fn3) monobody polypeptide
comprising two to six β-strand domains with a loop region linked
between each β-strand domain, which optionally has a
polypeptide tail region attached to one or both terminal β-strands,
and

(b) a second Fn3 monobody polypeptide comprising two to six
β-strand domains with a loop region linked between each β-strand
domain, which optionally has a polypeptide tail region attached to
one or both terminal β-strands,

wherein the first Fn3 fragment associates with the second Fn3 fragment
with a dissociation constant of less than 10⁻⁶ moles/liter.

2. The binding pair of claim 1, wherein at least one loop region is capable
of binding to a specific binding partner (SBP) to form a polypeptide:SBP
complex having a dissociation constant, as measured in the binding
reaction of the corresponding uncut, full-length monobody, of less than
10⁻⁶ moles/liter.

3. The binding pair of claim 2, wherein a second loop region is capable of
binding to a second specific binding partner (SBP-2), wherein the
binding has a dissociation constant, as measured in the binding reaction
of the corresponding uncut, full-length monobody, of less than 10⁻⁶
moles/liter.
4. The binding pair of claim 1, wherein at least one loop region is capable
of catalyzing a chemical reaction with a catalyzed rate constant (kcat), as
measured in the binding reaction of the corresponding uncut, full-length
monobody, and an uncatalyzed rate constant (kuncat) such that the ratio of
kcat/kuncat is greater than 10.

5. The binding pair of claim 1, wherein one or more of the loop regions
comprise amino acid residues:
i) from 15 to 16 inclusive in an AB loop;
ii) from 22 to 30 inclusive in a BC loop;
iii) from 39 to 45 inclusive in a CD loop;
iv) from 51 to 55 inclusive in a DE loop;
v) from 60 to 66 inclusive in an EF loop; or
vi) from 76 to 87 inclusive in an FG loop.

6. The binding pair of claim 1, wherein a loop region varies from a
corresponding [...] amino acid in the loop region, insertion of one to 25
amino acids, and/or replacement of at least one amino acid in the loop region.

7. The binding pair of claim 6, wherein the loop region varies from a
corresponding wild-type Fn3 loop region by deletion of one to all except
one amino acid and/or replacement of at least one amino acid.

8. The binding pair of claim [...], wherein the loop region varies from a
corresponding wild-type Fn3 loop region by insertion of one to 25 amino
acids.

9. The binding pair of claim 1, wherein the first Fn3 polypeptide further
comprising a first auxiliary domain, and the second Fn3 polypeptide
[...] domain has a binding affinity for the second auxiliary domain with a
dissociation constant of less than 10⁻⁵ moles/liter.

10. [...] a first cysteine and the second auxiliary region comprises a second
cysteine, and wherein the first cysteine and the second cysteine form a
disulfide bond.

11. The binding pair of claim 9, wherein the auxiliary domains are a natural
protein/peptide pair, a peptide-binding protein and its target peptide; or
two fragments of a protein that have been artificially generated.

12. The binding pair of claim 9, wherein the auxiliary domains are a pair of
coiled coils or a C-intein and N-intein pair.

13. The binding pair of claim 1, wherein the first polypeptide comprises a
first cysteine and the second polypeptide comprises a second cysteine,
and wherein the first cysteine and the second cysteine form a disulfide
bond.

14. The binding pair of claim 13, wherein the first cysteine is located in a
loop region.

15. The binding pair of claim 13, wherein the second cysteine is located in a
loop region.

16. The binding pair of claim 13, wherein the first cysteine is located in a
beta-strand region.

17. The binding pair of claim 13, wherein the second cysteine is located in a
beta-strand region.

18. A kit comprising the binding pair of claim 1.

19. A fibronectin type III (Fn3) polypeptide monobody comprising a first and
second Fn3 β-strand domain, and a first, second and third loop region,
wherein the first β-strand domain is linked between the first and second
loop regions, wherein the second β-strand domain is linked between the
second and third loop regions, and wherein a unique peptide cleavage site
exists in one of the loop regions.

20. The polypeptide monobody of claim 19, wherein the unique cleavage site
is in the second loop region.

21. The nucleic acid molecule of claim 19, wherein the nucleic acid molecule
encodes the first or third monobody loop region that is varied as
compared to the wild-type loop region by deletion of one to all except
one amino acids in the loop region, insertion of at least one to 25 amino
acids, and/or replacement of at least one amino acid in the loop region.

22. The nucleic acid molecule of claim 19, wherein the nucleic acid molecule
encodes at least one loop region capable of binding to a specific binding
partner (SBP) to form a polypeptide:SBP complex having a dissociation
constant of less than 10⁻⁶ moles/liter.

23. The nucleic acid molecule of claim 19, wherein the nucleic acid molecule
encodes at least two loop regions capable of binding to a specific binding
partner (SBP) to form a polypeptide:SBP complex, each complex having
a dissociation constant of less than 10⁻⁶ moles/liter.

24. An isolated nucleic acid molecule encoding a fibronectin type III (Fn3)
polypeptide monobody comprising a first and second Fn3 β-strand
domain, and a first, second and third loop region, wherein the first
β-strand domain is linked between the first and second loop regions,
wherein the second β-strand domain is linked between the second and
third loop regions, and wherein a unique peptide cleavage site exists in
one of the loop regions.

25. The nucleic acid molecule of claim 24, wherein the unique cleavage site
is in the second loop region.

26. The nucleic acid molecule of claim 24, wherein the nucleic acid molecule
encodes the first or third monobody loop region that is varied as
compared to the wild-type loop region by deletion of one to all except
one amino acids in the loop region, insertion of at least one to 25 amino
acids, and/or replacement of at least one amino acid in the loop region.

27. The nucleic acid molecule of claim 24, wherein the nucleic acid molecule 
encodes at least one loop region capable of binding to a specific binding 
partner (SBP) to form a polypeptide:SBP complex having a dissociation
constant of less than 10⁻⁶ moles/liter.

28. The nucleic acid molecule of claim 24, wherein the nucleic acid molecule 
encodes at least two loop regions capable of binding to a specific binding 
partner (SBP) to form a polypeptide:SBP complex, each complex having
a dissociation constant of less than 10⁻⁶ moles/liter.

29. An expression vector comprising an expression cassette operably linked 
to the nucleic acid molecule of claim 24. 

30. The expression vector of claim 29, wherein the expression vector is an 
M13 phage-based plasmid.

31. A host cell comprising the vector of claim 29. 

32. A method of preparing a fibronectin type III (Fn3) polypeptide monobody 
binding pair comprising the steps of: 

(a) providing a first fibronectin type III (Fn3) monobody polypeptide
consisting of two to six β-strand domains with a loop region
linked between each β-strand domain, and

(b) providing a second Fn3 monobody polypeptide consisting of two
to six β-strand domains with a loop region linked between each
β-strand domain,

wherein the first Fn3 fragment associates with the second Fn3 fragment
to form a binding pair with a dissociation constant of less than 10⁻⁶
moles/liter.

33. The method of claim 32, wherein one or more of the loop regions 
comprise amino acid residues: 

i) from 15 to 16 inclusive in an AB loop;

ii) from 22 to 30 inclusive in a BC loop; 

iii) from 39 to 45 inclusive in a CD loop; 

iv) from 51 to 55 inclusive in a DE loop;

v) from 60 to 66 inclusive in an EF loop; or 

vi) from 76 to 87 inclusive in an FG loop. 
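
For illustration only, the loop boundaries recited in claim 33 can be written as a small lookup table. The sketch below assumes the residue numbering used in the claim; the names `FN3_LOOPS` and `loop_for_residue` are hypothetical and do not appear in the application.

```python
# Loop boundaries as recited in claim 33 (inclusive residue numbers).
FN3_LOOPS = {
    "AB": (15, 16),
    "BC": (22, 30),
    "CD": (39, 45),
    "DE": (51, 55),
    "EF": (60, 66),
    "FG": (76, 87),
}


def loop_for_residue(position):
    """Return the name of the loop containing `position`, or None if the
    residue lies outside every recited loop range."""
    for name, (start, end) in FN3_LOOPS.items():
        if start <= position <= end:
            return name
    return None
```

A residue that falls between two recited ranges (for example residue 35) returns `None`, which under this numbering would place it in a β-strand region rather than a loop.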

34. The method of claim 32, wherein a loop region varies from a
corresponding wild-type loop region by deletion of at least two to all but
one amino acid in the loop region, insertion of at least two to 25 amino
acids, or replacement of at least two amino acids in the loop region.

35. The method of claim 34, wherein the loop region varies from a
corresponding wild-type Fn3 loop region by deletion of one to all except
one amino acid and/or replacement of at least one amino acid.

36. The method of claim 34, wherein the loop region varies from a
corresponding wild-type Fn3 loop region by insertion of from one to 25
amino acids.

37. The method of claim 32, wherein the first Fn3 polypeptide further
comprises a first auxiliary domain, and the second Fn3 polypeptide
further comprises a second auxiliary domain, wherein the first auxiliary
domain has a binding affinity for the second auxiliary domain with a
dissociation constant of less than 10 s moles/liter. 

38. The method of claim 37, wherein the first auxiliary region comprises a
first cysteine and the second auxiliary region comprises a second 
cysteine, and wherein the first cysteine and the second cysteine form a 
disulfide bond. 

39. The method of claim 37, wherein the auxiliary domains are a natural 
protein/peptide pair, a peptide-binding protein and its target peptide; or 
two fragments of a protein that have been artificially generated. 

40. The method of claim 39, wherein the auxiliary domains are a pair of 
coiled coils or a C-intein and N-intein pair. 

41. The method of claim 39, wherein the first auxiliary region comprises a 
first cysteine and the second auxiliary region comprises a second 
cysteine, and wherein the first cysteine and the second cysteine form a 
disulfide bond. 

42. The method of claim 32, wherein the first polypeptide comprises a first 
cysteine and the second polypeptide comprises a second cysteine, and 
wherein the first cysteine and the second cysteine form a disulfide bond. 

43. The method of claim 42, wherein the first cysteine is located in a loop 
region. 

44. The method of claim 42, wherein the second cysteine is located in a loop 
region. 

45. The method of claim 42, wherein the first cysteine is located in a beta- 
strand region. 

46. The method of claim 42, wherein the second cysteine is located in a beta- 
strand region. 

47. [...] from 6 to 75 nucleic acid bases is inserted into a loop region.

48. A kit for performing the method of claim 32.

49. A monobody made by the method of claim 32.

50. A method of preparing a fibronectin type III (Fn3) polypeptide monobody
[...]

a) [...] acid [...] a loop region flanked by a first Fn3 β-strand
domain and a second Fn3 β-strand domain, wherein the
loop region contains a unique peptide cleavage site;

b) providing a second DNA that encodes a second amino acid [...],
wherein the second amino acid comprises a loop region flanked
[...] the loop region contains a unique peptide cleavage site;

c) making a mutation [...] the loop region of the first or [...]
amino acids, a deletion of one to all except one amino acid and/or
substitution of one or more nucleic acids;

d) expressing the first and second DNA molecules to yield a
[...] polypeptide monobody, wherein the first monobody
associates with the second monobody with a dissociation constant
less than 10⁻⁶ M.

51. A kit for performing the method of claim 50 comprising the first and
second DNAs.

52. A variegated nucleic acid library encoding Fn3 polypeptide monobodies
made by the method of claim 50.

53. The variegated nucleic acid library of claim 52, wherein the first or
second loop region encodes: 

i) an AB amino acid loop from residue 15 to 16 inclusive; 

ii) a BC amino acid loop from residue 22 to 30 inclusive; 

iii) a CD amino acid loop from residue 39 to 45 inclusive; 

iv) a DE amino acid loop from residue 51 to 55 inclusive; 

v) an EF amino acid loop from residue 60 to 66 inclusive; or 

vi) an FG amino acid loop from residue 76 to 87 inclusive. 

54. The variegated nucleic acid library of claim 52, wherein the first or 
second loop regions vary from the wild-type Fn3 loop regions by deletion 
of one to all except one amino acid and/or replacement of at least one 
amino acid.

55. The variegated nucleic acid library of claim 52, wherein the first or 
second loop regions vary from corresponding wild-type Fn3 loop
regions by insertion of from one to 25 amino acids.

56. The variegated nucleic acid library of claim 52, wherein a nucleic acid of 
from 3 to 75 nucleic acid bases is inserted in the first or second loop 
region. 

57. The variegated nucleic acid library of claim 52, wherein the first or 
second loop is a BC loop. 

58. The variegated nucleic acid library of claim 52, wherein the first or 
second loop is a DE loop. 

59. The variegated nucleic acid library of claim 52, wherein the first or 
second loop is an FG loop. 

60. The variegated nucleic acid library of claim 52, wherein the first or
second loop is an AB loop.

61. The variegated nucleic acid library of claim 52, wherein the first or
second loop is a CD loop.

62. The variegated nucleic acid library of claim 52, wherein the first or
second loop is an EF loop.



63. A peptide library derived from the variegated nucleic acid library of
claim 52.

64. The peptide library of claim 63, wherein the monobody is expressed
using a yeast two-hybrid system.

65. The peptide library of claim 63, wherein the monobody is expressed
using a yeast surface display system.

66. The peptide library of claim 63, wherein the monobody is displayed on
the surface of a bacteriophage or virus.

67. A fibronectin type III (Fn3) monobody polypeptide binding pair
comprising:

(a) a first Fn3 fragment comprising a first loop region flanked by a
first Fn3 β-strand domain and a second Fn3 β-strand domain,
wherein at least one β-strand domain is altered as compared to the
corresponding wild-type β-strand domain, and

(b) a second Fn3 fragment comprising a second loop region flanked by
a third Fn3 β-strand domain and a fourth Fn3 β-strand domain,
wherein at least one β-strand domain is altered as compared to the
corresponding wild-type β-strand domain,

wherein the first and second Fn3 fragments associate with a dissociation
constant less than 10⁻⁶ moles/liter.

68. A fibronectin type III (Fn3) monobody polypeptide binding pair of claim
63 wherein a monobody with an altered β-strand domain does not
associate with a monobody comprising wild type Fn3 β-strand domains
with a dissociation constant of less than 10⁻⁶ moles/liter.

69. A method of reconstituting specific monobody polypeptide binding pairs 
comprising: 

(a) providing a binding pair as recited in claim 10 (Fn3 fragments X
and Y), and

(b) providing a binding pair as recited in claim 64 (Fn3 fragments W
and Z),

wherein fragments X and Y associate with each other with a dissociation
constant of less than 10⁻⁶ moles/liter and fragments W and Z associate
with each other with a dissociation constant of less than 10⁻⁶ moles/liter,
but associations of XW, XZ, YW, or YZ are not formed with dissociation
constants of less than 10⁻⁶ moles/liter.
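
The binding pattern recited in claim 69 (cognate pairs XY and WZ bind; cross pairs XW, XZ, YW and YZ do not) can be expressed as a simple check over a table of dissociation constants. The following is a minimal sketch, assuming dissociation constants are supplied in moles/liter; the function name `is_orthogonal` and the numbers in `example_kd` are hypothetical and are not data from the application.

```python
# Threshold recited in claim 69, in moles/liter.
KD_THRESHOLD = 1e-6


def is_orthogonal(kd, cognate_pairs, threshold=KD_THRESHOLD):
    """Check the pattern of claim 69 for a table of dissociation constants.

    kd maps frozenset({fragment_a, fragment_b}) to a dissociation constant
    in moles/liter. Returns True only if every cognate pair is present and
    binds below the threshold, and every other listed pair does not.
    """
    cognate = {frozenset(pair) for pair in cognate_pairs}
    if not cognate.issubset(kd.keys()):
        return False
    for pair, value in kd.items():
        binds = value < threshold
        if binds != (pair in cognate):
            return False
    return True


# Hypothetical values chosen only to exercise the function; not measured data.
example_kd = {
    frozenset({"X", "Y"}): 1e-8,
    frozenset({"W", "Z"}): 5e-8,
    frozenset({"X", "W"}): 1e-4,
    frozenset({"X", "Z"}): 2e-4,
    frozenset({"Y", "W"}): 1e-3,
    frozenset({"Y", "Z"}): 5e-5,
}
```

With these illustrative values, `is_orthogonal(example_kd, [("X", "Y"), ("W", "Z")])` holds; lowering any cross-pair value below the threshold makes the check fail.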

70. A fibronectin type III (Fn3) monobody polypeptide consisting of two to
six β-strand domains with a loop region linked between each β-strand
domain, and wherein the monobody polypeptide is capable of binding to
a target molecule with a dissociation constant of less than 10⁻⁶
moles/liter.

71. The monobody of claim 70, wherein at least one loop region binds to the
target molecule. 

72. The monobody of claim 70, wherein at least one loop region comprises 
amino acid residues: 

i) from 15 to 16 inclusive in an AB loop; 

ii) from 22 to 30 inclusive in a BC loop;

iii) from 39 to 45 inclusive in a CD loop;

iv) from 51 to 55 inclusive in a DE loop;

v) from 60 to 66 inclusive in an EF loop; or

vi) from 76 to 87 inclusive in an FG loop.

73. The monobody of claim 70, wherein at least one loop region varies from
[...] of at least two amino acids in the loop region.

74. The monobody of claim 70, wherein the loop region varies from a
corresponding wild-type Fn3 loop region by the insertion of from two to
25 amino acids.

75. [...] second monobody with a dissociation constant of less than 10[...]
moles/liter.

77. [...] to the nucleic acid molecule of claim 76.

78. A host cell comprising the vector of claim 77.

NdeI
CATATGCAGGTTTCTGATGTTCCGCGTGACCTGGAAGTTGTTGCTGCGACCCCGACTAGC
   MetGlnValSerAspValProArgAspLeuGluValValAlaAlaThrProThrSer
   -2 -1 1                          10  A

BclI  PvuII  PstI  BC loop  BsiWI
CTGCTGATCAGCTGGGATGCTCCTGCAGTTACCGTGCGTTATTACCGTATCACGTACGGT
LeuLeuIleSerTrpAspAlaProAlaValThrValArgTyrTyrArgIleThrTyrGly
      20  B                         30  C

EcoRI
GAAACCGGTGGTAACTCCCCGGTTCAGGAATTCACTGTACCTGGTTCCAAGTCTACTGCT
GluThrGlyGlyAsnSerProValGlnGluPheThrValProGlySerLysSerThrAla
      40  D                         50  E

SalI  Bst1107I
ACCATCAGCGGCCTGAAACCGGGTGTCGACTATACCATCACTGTATACGCTGTTACTGGC
ThrIleSerGlyLeuLysProGlyValAspTyrThrIleThrValTyrAlaValThrGly

FG loop  SacI
CGTGGTGACAGCCCAGCGAGCTCCAAGCCAATCTCGATTAACTACCGTACCTAGTAACTC
ArgGlyAspSerProAlaSerSerLysProIleSerIleAsnTyrArgThr
      80                            90  G

Figure 25

Figure 38 A-F

SEQUENCE LISTING

<110> Research Corporation Technologies, Inc.
      Koide, Shohei

<120> Reconstituted Polypeptides

<130> 109.054WO1

<150> US 60/386,991
<151> 2002-06-06

<160> 135

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 14
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide



ru «. «, « « »p ^ °» - ° iy 

10 



i 

<210> 2
<211> 17
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

,;r u - - « ^ ^ «u « «. ^ - 

l 5 
Gly 

<220>
<223> A peptide

<400> 6
Tyr Ala Val Arg Asp Tyr Arg Leu Asp Tyr Ala Ser Ser Lys Pro Ile
1               5                   10                  15

<210> 7
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

<400> 7
Tyr Ala Val Arg Asp Tyr Arg Leu Asp Tyr Lys Pro Ile
1               5                   10

<210> 8
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

<400> 8
Tyr Ala Val Arg Asp Tyr Arg Ser Lys Pro Ile
1               5                   10

<210> 9
<211> 14
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

<400> 9
Tyr Ala Val Thr Arg Asp Tyr Arg Leu Ser Ser Lys Pro Ile
1               5                   10

<220>
<223> An oligonucleotide



sd-cc, ta tg = a9g t C tct g at g tt= c 9 = g t g a=c t gg a ag tt 3 tt g ct g c g acc 



<210> 14
<211> 55
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<400> 14
taactgcagg agcatcccag ctgatcagca ggctagtcgg ggtcgcagca acaac

<210> 15
<211> 51
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<400> 15
ctcctgcagt taccgtgcgt tattaccgta tcacgtacgg tgaaaccggt g

<210> 16
<211> 39
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<400> 16
gtgaattcct gaaccgggga gttaccaccg gtttcaccg

<400> 20
cgggatccga gctcgctggg ctgtcaccac ggccagtaac agcgtataca gtgat          55

<210> 21
<211> 35
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<400> 21
cagcgagctc caagccaatc tcgattaact accgt                                35

<210> 22
<211> 37
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<400> 22
cgggatcctc gagttactag gtacggtagt taatcga                              37

<210> 23
<211> 38
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<400> 23
cgggatccac gcgtgccacc ggtacggtag ttaatcga                             38

<210> 24
<211> 44
<212> DNA
<213> Artificial Sequence

<210> 28
<211> 57
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<221> misc_feature
<222> (1)...(57)
<223> n = A, T, G or C

<400> 28
tgtatacgct gttactggcn nknnknnknn knnknnknnk tccaagccaa tctcgat

<210> 29
<211> 47
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<221> misc_feature
<222> (1)...(47)
<223> n = A, T, G or C

<400> 29
ctgtatacgc tgttactggc nnknnknnkn nkccagcgag ctccaag
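
The randomized positions in these oligonucleotides are written as repeated `nnk` codons. As a minimal sketch, assuming `k` denotes G or T under the IUPAC nucleotide code (the listing itself defines only `n`) and assuming the standard genetic code, the codons reachable from one `nnk` position can be enumerated as follows. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """All codons matching NNK: any base, any base, then G or T."""
    return ["".join(codon) for codon in product("ACGT", "ACGT", "GT")]


def summarize_nnk():
    """Return (codon count, sorted one-letter amino acids, stop codons)."""
    codons = nnk_codons()
    encoded = [CODON_TABLE[codon] for codon in codons]
    amino_acids = sorted(set(encoded) - {"*"})
    stops = [codon for codon in codons if CODON_TABLE[codon] == "*"]
    return len(codons), amino_acids, stops
```

Under these assumptions an `nnk` position spans 32 codons that cover all 20 amino acids and include a single stop codon (TAG), which is the usual reason for preferring NNK over fully random NNN codons in loop libraries.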

<210> 30
<211> 51
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide

<221> misc_feature
<222> (1)...(51)
<223> n = A, T, G or C

<210> 34
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence of a ubiquitin-binding monobody

<400> 34
Arg Trp Val Gly Leu Ala Trp
1               5

<210> 35
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence of a ubiquitin-binding monobody

<400> 35
Cys Lys His Arg Arg
1               5

<210> 36
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence of a ubiquitin-binding monobody

<400> 36
Phe Ala Asp Leu Trp Trp Arg
1               5

<210> 37
<211> 5
<212> PRT
<213> Artificial Sequence

<210> 41
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence of a ubiquitin-binding monobody

<400> 41
Ser Arg Leu Arg Arg
1               5

<210> 42
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence of a ubiquitin-binding monobody

<400> 42
Pro Pro Trp Arg Val
1               5

<210> 43
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence of a ubiquitin-binding monobody

<400> 43
Ala Arg Trp Thr Leu
1               5

<210> 44
<211> 5
<212> PRT
<213> Artificial Sequence

<210> 48
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 48
Arg Gly Asp Ser Pro Ala Ser
1               5

<210> 49
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 49
Cys Asn Trp Arg Arg
1               5

<210> 50
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 50
Arg Ala Tyr Arg Tyr Arg Trp
1               5

<210> 51
<211> 5
<212> PRT
<213> Artificial Sequence

<210> 55
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 55
Cys Ala Arg Arg Arg
1               5

<210> 56
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 56
Arg Arg Ala Gly Trp Gly Trp
1               5

<210> 57
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 57
Cys Asn Trp Arg Arg
1               5

<210> 58
<211> 7
<212> PRT
<213> Artificial Sequence

<210> 62
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 62
Arg Ala Tyr Arg Tyr Arg Trp
1               5

<210> 63
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 63
Glu Arg Arg Val Pro
1               5

<210> 64
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 64
Arg Leu Leu Leu Trp Gln Arg
1               5

<210> 65
<211> 5
<212> PRT
<213> Artificial Sequence

<210> 69
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 69
Cys Asn Trp Arg Arg
1               5

<210> 70
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 70
Arg Ala Tyr Arg Tyr Arg Trp
1               5

<210> 71
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 71
Ala Val Thr Val Arg
1               5

<210> 72
<211> 5
<212> PRT
<213> Artificial Sequence

<210> 76
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 76
Arg Arg Trp Trp Ala
1               5

<210> 77
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 77
Gly Gln Arg Thr Phe
1               5

<210> 78
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 78
Arg Arg Trp Trp Ala
1               5

<210> 79
<211> 5
<212> PRT
<213> Artificial Sequence

<210> 83
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 83
Gly Gln Arg Thr Phe
1               5

<210> 84
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 84
Arg Arg Trp Trp Ala
1               5

<210> 85
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 85
Leu Arg Tyr Arg Ser
1               5

<210> 86
<211> 5
<212> PRT
<213> Artificial Sequence

<210> 90
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 90
Arg Arg Trp Trp Ala
1               5

<210> 91
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 91
Leu Arg Tyr Arg Ser
1               5

<210> 92
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> A clone from Library #2

<400> 92
Gly Trp Arg Trp Arg
1               5

<210> 93
<211> 15
<212> DNA
<213> Artificial Sequence

<210> 97
<211> 15
<212> DNA
<213> Artificial Sequence

<220>
<223> Sequence in the variegated loops of enriched clones

<400> 97
tcgaggttgc ggcgg

<210> 98
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence in the variegated loops of enriched clones

<400> 98
Ser Arg Leu Arg Arg
1               5

<210> 99
<211> 15
<212> DNA
<213> Artificial Sequence

<220>
<223> Sequence in the variegated loops of enriched clones

<400> 99
ccgccgtgga gggtg

<210> 100
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence in the variegated loops of enriched clones

<220>
<223> Sequence in the variegated loops of enriched clones

<400> 104
Arg Arg Trp Trp Ala
1               5

<210> 105
<211> 15
<212> DNA
<213> Artificial Sequence

<220>
<223> Sequence in the variegated loops of enriched clones

<400> 105
gcgaggtgga cgctt                                                      15

<210> 106
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Sequence in the variegated loops of enriched clones

<400> 106
Ala Arg Trp Thr Leu
1               5
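
The 15-base DNA entries in this part of the listing pair with the 5-residue peptide entries that follow them. As a minimal sketch, assuming the standard genetic code and reading each DNA entry in frame from its first base, the pairing can be checked by translation. The function name `translate` and the `Ter` label for stop codons are choices made for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter codes, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string in frame from its first base.

    Spaces are ignored, so sequences can be pasted as printed in the
    listing. Any trailing partial codon is dropped.
    """
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3
    residues = [CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3)]
    return " ".join(THREE_LETTER[aa] for aa in residues)
```

For example, `translate("gcgaggtgga cgctt")` (SEQ ID NO: 105) gives `Ala Arg Trp Thr Leu`, the peptide listed as SEQ ID NO: 106.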

<210> 107
<211> 15
<212> DNA
<213> Artificial Sequence

<220>
<223> Sequence in the variegated loops of enriched clones

<400> 107
aggcggtggt ggtgg                                                      15

Gly Val Asp Tyr Thr Ile Thr Val Tyr Ala Val Thr Gly Arg Gly Asp
Ser Pro Ala Ser Ser Lys Pro Ile Ser Ile Asn Tyr Arg Thr
                85                  90

<210> 111
<211> 300
<212> DNA
<213> Artificial Sequence

<220>
<223> A designed Fn3 gene

<400> 111
catatgcagg tttctgatgt tccgcgtgac ctggaagttg ttgctgcgac cccgactagc     60
ctgctgatca gctgggatgc tcctgcagtt accgtgcgtt attaccgtat cacgtacggt    120
gaaaccggtg gtaactcccc ggttcaggaa ttcactgtac ctggttccaa gtctactgct    180
accatcagcg gcctgaaacc gggtgtcgac tataccatca ctgtatacgc tgttactggc    240
cgtggtgaca gcccagcgag ctccaagcca atctcgatta actaccgtac ctagtaactc    300

<210> 112
<211> 98
<212> PRT
<213> Artificial Sequence

<220>
<223> A designed Fn3 gene

3„;:r; l "va 1 ser ^ «. », ». «. ~ * - - - 

5 10 

1 Tlo c^r Tro Asp Ala Pro Ala Val Thr Val 

Thr Pro Thr Ser Leu Leu He Ser Trp Asp ax 

25 30 
«g Tyr Tyr «g H« Thr Tyr Gly 01« Thr sly Gly Asn ser Pro val 

Glu L Thr val Pro Gly sir Lys Ser Thr *1, Leu Thr He Ser 

G1 y L Lys Pro Gly val Tyr Thr He Thr Val Tyr U. val Thr 

,0Gly «, Gly ASP s.r Pro Ma ser Ser Lys 11 He Ser He « Tyr 

90 9b 
85 yU 

Arg Thr 




<400> 116
Leu Arg Tyr Arg Ser Gly Trp Arg Trp Arg
1               5                   10

<210> 117
<211> 12
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

<400> 117
Cys Asn Trp Arg Arg Arg Ala Tyr Arg Tyr Trp Arg
1               5                   10

<210> 118
<211> 12
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

<400> 118
Ala Arg Met Arg Glu Arg Trp Leu Arg Gly Arg Tyr
1               5                   10

<210> 119
<211> 4
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

<400> 119
Glu Ile Asp Lys
1

<400> 122
Gly Gly Met Gly Gly
1               5

<210> 123
<211> 96
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

<400> 123
Met Gln Val Ser Asp Val Pro Thr Asp Leu Glu Val Val Ala Ala Thr
1               5                   10                  15
Pro Thr Ser Leu Leu Ile Ser Trp Asp Ala Pro Ala Val Thr Val Arg
            20                  25                  30
Tyr Tyr Arg Ile Thr Tyr Gly Glu Thr Gly Gly Asn Ser Pro Val Gln
        35                  40                  45
Glu Phe Thr Val Pro Gly Ser Lys Ser Thr Ala Thr Ile Ser Gly Leu
    50                  55                  60
Lys Pro Gly Val Asp Tyr Thr Ile Thr Val Tyr Ala Val Thr Gly Arg
65                  70                  75                  80
Gly Asp Ser Pro Ala Ser Ser Lys Pro Ile Ser Ile Asn Tyr Arg Thr
                85                  90                  95

<210> 124
<211> 6
<212> PRT
<213> Artificial Sequence

<220>
<223> A peptide

<221> SITE
<222> 6
<223> Xaa = homoserine lactone

<400> 124
Gly Gly Asn Gly Gly Xaa

<220>
<223> An oligonucleotide for the construction of plasmids for monobody
      reconstitution

<400> 128
gtggtaccgg tggttcccct ccaaaaaaga agagaag

<210> 129
<211> 37
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide for the construction of plasmids for monobody
      reconstitution

<400> 129
ggaaccaccg gtaccaccgg tacggtagtt aatcgag

<210> 130
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide for the construction of plasmids for monobody
      reconstitution

<400> 130
taatacgact cactataggg

<210> 131
<211> 31
<212> DNA
<213> Artificial Sequence

<220>
<223> An oligonucleotide for the construction of plasmids for monobody
      reconstitution

<400> 131
ccgactcgag ttaatctcca ctcagcaaga g
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5 ;r;" ly G 1V «» V.1 Cys V,l M5 «P Ser ~P « 

1 5 

Ser Trp Leu Lys Pro 
20 
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